THE
QUARTERLY JOURNAL OF ECONOMICS Vol. CXXIV
August 2009
Issue 3
SLUGGISH RESPONSES OF PRICES AND INFLATION TO MONETARY SHOCKS IN AN INVENTORY MODEL OF MONEY DEMAND∗
FERNANDO ALVAREZ
ANDREW ATKESON
CHRIS EDMOND
We examine the responses of prices and inflation to monetary shocks in an inventory-theoretic model of money demand. We show that the price level responds sluggishly to an exogenous increase in the money stock because the dynamics of households’ money inventories leads to a partially offsetting endogenous reduction in velocity. We also show that inflation responds sluggishly to an exogenous increase in the nominal interest rate because changes in monetary policy affect the real interest rate. In a quantitative example, we show that this nominal sluggishness is substantial and persistent if inventories in the model are calibrated to match U.S. households’ holdings of M2.
∗ A previous draft of this paper circulated under the title “Can a Baumol–Tobin Model Account for the Short-Run Behavior of Velocity?” We would like to thank Robert Barro, Michael Dotsey, Tim Fuerst, Robert Lucas, Julio Rotemberg, and several anonymous referees for helpful comments. For financial support, Alvarez thanks the NSF and the Templeton Foundation and Atkeson thanks the NSF.
1. Traditionally, the literature on inventory-theoretic models of money demand has focused on the steady-state implications of these models for money demand (e.g., Barro [1976], Jovanovic [1982], Romer [1986], Chatterjee and Corbae [1992]). Here we examine the implications of an inventory-theoretic model of the demand for money for the dynamics of prices and inflation following a shock to money or to interest rates.
© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009
I. INTRODUCTION
In this paper, we examine the dynamics of money, velocity, prices, interest rates, and inflation in an inventory-theoretic model of the demand for money.1 We show that our inventory-theoretic model offers new answers to two important questions: why do prices respond sluggishly to changes in money, and why
does inflation respond sluggishly to changes in the short-term nominal interest rate? We first show analytically how prices and inflation are both sluggish in our model, even though price setting is fully flexible. We then show through a quantitative example that this sluggishness is substantial and persistent when our inventory-theoretic model is interpreted as applying to a broad monetary aggregate like M2. Our model is inspired by the analyses of money demand developed by Baumol (1952) and Tobin (1956). In their models, households carry money (despite the fact that money is dominated in rate of return by interest-bearing assets) because they face a fixed cost of trading money and these other assets. Our model is a simplified version of their framework. We study a cash-in-advance model with physically separated asset and goods markets. Households have two financial accounts: a brokerage account in the asset market in which they hold a portfolio of interest-bearing assets and a bank account in the goods market in which they hold money to pay for consumption. We assume that households do not have the opportunity to exchange funds between their brokerage and bank accounts every period. Instead, we assume they have the opportunity to transfer funds between accounts only once every N ≥ 1 periods. Hence, households maintain inventories of money in their bank accounts large enough to pay for consumption expenditures for several periods. They replenish these inventories with transfers from their brokerage accounts once every N periods. As households optimally manage these inventories, their money holdings follow a sawtooth pattern—rising rapidly with each periodic transfer from their brokerage account and then falling slowly as these funds are spent smoothly over time—similar to the sawtooth pattern of money holdings originally derived by Baumol (1952) and Tobin (1956) and more recently by Duffie and Sun (1990) and Abel, Eberly, and Panageas (2007). 
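The sawtooth pattern described above can be illustrated with a minimal sketch. It is not the model's solution: it assumes, purely for illustration, that a household starts with an empty bank account, receives a transfer of fixed size once every N periods, and spends it in equal installments until the next transfer. The function name and the even-spending rule are hypothetical.

```python
def sawtooth_balances(n_periods_between_transfers, transfer, horizon):
    """Beginning-of-period money balances for one household.

    Assumption (illustrative only): the household starts with an empty
    bank account, receives `transfer` once every N periods, and spends
    an equal share transfer / N in each period.
    """
    n = n_periods_between_transfers
    spend_per_period = transfer / n
    balances = []
    balance = 0.0
    for t in range(horizon):
        if t % n == 0:            # period in which the household is active
            balance += transfer
        balances.append(balance)  # balance at the start of goods-market trade
        balance -= spend_per_period
    return balances


if __name__ == "__main__":
    for t, m in enumerate(sawtooth_balances(4, 100.0, 12)):
        print(t, round(m, 2))
```

Balances jump in each period in which the household is active and then decline by a constant amount until the next transfer.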
Here, we focus on the implications of our model for the response of prices to a change in money growth and the response of inflation to a change in interest rates. To highlight the specific mechanisms at work, we make the stark assumptions that price setting is fully flexible and that output in the model is exogenous so that our results can easily be compared to those from a flexible-price, constant-velocity, exogenous output benchmark cash-in-advance model of the effect of monetary policy on prices and inflation. Our first result is that prices respond sluggishly to a change in money in our model, because an exogenous increase in the stock
of money leads endogenously, through the dynamics of households’ inventories of money, to a partially offsetting decrease in the velocity of money. As a result of this endogenous fall in velocity, prices respond on impact less than one for one to the change in money. Prices respond fully only in the long run when households’ inventories of money, and hence aggregate velocity, settle back down to their steady-state values. The sluggish response of prices to a change in money in our model can thus be understood not as a consequence of a sticky price-setting policy of firms but as a simple consequence of the sluggish response of nominal expenditure to a change in money inherent in an inventory-theoretic approach to money demand.
We highlight this implication of our inventory-theoretic model of money demand because a strong negative correlation between fluctuations in money and velocity can be seen clearly in U.S. data. In Figure I, we illustrate this short-run behavior of money and velocity. We plot the ratio of M2 to consumption and the consumption velocity of M2 as deviations from a trend extracted using an HP filter. These two series are strongly negatively correlated.2 After presenting our analytical results, we examine the extent to which our model can reproduce this comovement of money and velocity in a quantitative example.
The mechanism through which our model produces a negative correlation between fluctuations in money and velocity and hence sluggish prices can be understood in two steps. First, consider how aggregate velocity is determined in this inventory-theoretic model of money demand. Households at different points in the cycle of depleting and replenishing their inventories of money in their bank accounts have different propensities to spend the money that they have on hand, or, equivalently, different individual velocities of money.
2. We used the HP-filter smoothing parameter of 3^4 × 1,600 = 129,600 recommended by Ravn and Uhlig (2002) for monthly data. As discussed in the Appendix, similar results are obtained using alternative measures of the short-run fluctuations in money and velocity.
FIGURE I
Short-Run Negative Correlation of Money/Real Consumption, M/c, and Velocity, v
We measure the money supply M as the M2 stock from the Board of Governors of the Federal Reserve System. We measure real consumption c as personal consumption expenditure on nondurables and services from the Bureau of Economic Analysis deflated by the personal consumption expenditures chain-type price index P from the BEA. We define velocity as v ≡ Pc/M. All data are monthly 1959:1–2006:12 and seasonally adjusted. All variables are reported in logs as deviations from an HP trend with smoothing parameter 3^4 × 1,600.
Households that have recently transferred funds from their brokerage accounts to their bank accounts have a large stock of money in their bank account and tend to spend this money slowly to spread their spending smoothly over the interval of time that remains before they next have the opportunity to replenish their bank account. Hence, these households have a relatively low individual velocity of money. In contrast, households that have not transferred funds from their brokerage account in
the recent past and anticipate having the opportunity to make such a transfer soon tend to spend the money that they have in the bank at a relatively rapid rate, and thus have a relatively high individual velocity of money. Aggregate velocity is given by the weighted average of the individual velocities of money across all households with weights determined by the distribution of money across households. Now consider the effects on aggregate velocity of an increase in the money supply brought about by an open market operation. In this open market operation, the government trades newly created money for interest-bearing securities, and households, on the opposite side of the transaction, trade interest-bearing securities held in their brokerage accounts for newly created money. If the
nominal interest rate is positive, this new money is purchased only by those households that currently have the opportunity to transfer funds from their brokerage accounts to their bank accounts because these are the only households that currently have the opportunity to begin spending this money. All other households choose not to participate in the open market operation because these households would have to leave this money sitting idle in their brokerage accounts, where it would be dominated in rate of return by interest-bearing securities. Hence, as a result of this open market operation, the fraction of the money stock held by those households currently able to transfer resources from their brokerage account to their bank account rises. Because these households have a lower-than-average propensity to spend this money, aggregate velocity falls. In this way, an exogenous increase in the supply of money leads to an endogenous reduction in the aggregate velocity of money and hence a diminished, or sluggish, response of the price level. To this point we have modeled changes in monetary policy as exogenously specified changes in the money supply. It is now common to model changes in monetary policy not as exogenously specified changes in money but as exogenously specified changes in the short-term nominal interest rate. When we model monetary policy in this way, we find our second result, that expected inflation responds sluggishly to a change in the short-term nominal interest rate. To gain intuition for this result, it is useful to consider the Fisher equation to decompose any change in the nominal interest rate into its two components—a change in the real interest rate and a change in expected inflation. 
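In symbols, the Fisher decomposition referred to above can be sketched as follows. The notation is introduced here only for illustration and is not the paper's: $i_t$ is the short-term nominal interest rate, $r_t$ the real interest rate, and $\mathbb{E}_t\,\pi_{t+1}$ expected inflation.

```latex
i_t \approx r_t + \mathbb{E}_t\,\pi_{t+1}
\quad\Longrightarrow\quad
\Delta i_t \approx \Delta r_t + \Delta \mathbb{E}_t\,\pi_{t+1}.
```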
For example, in a standard flexible-price, constant-endowment cash-in-advance model, the real interest rate is always constant, so that, given the Fisher equation, any change in the nominal interest rate must always be accompanied by a matching change in expected inflation. In this sense, in this model, expected inflation must respond immediately to a change in the nominal interest rate. More generally, from the Fisher equation, if a model is to generate a sluggish response of expected inflation to a change in the nominal interest rate caused by a change in monetary policy implemented through open market operations, it must do so because those open market operations generate, at equilibrium, a change in the real interest rate that is roughly as large as the change in the nominal interest rate. In our inventory-theoretic model of money demand, money injections
implemented through open market operations have an effect on the real interest rate because the asset market is segmented, and it is this effect of open market operations on the real interest rate that is the source of the inflation sluggishness in our model. Asset markets are segmented in our model in the sense that only those agents who currently have the opportunity to transfer money between their brokerage and bank accounts are at the margin in participating in open market operations and in determining asset prices. This asset market segmentation arises naturally in an inventory-theoretic model of the demand for money because those agents who do not have the opportunity to transfer money between the asset and goods markets have no desire to purchase money being injected into the asset market through an open market operation because they have no ability to spend that money in the current period and they find that interest-bearing bonds dominate money as a store of value in the asset market.3 Because only those agents who currently have the opportunity to transfer money from the asset market to the goods market are at the margin in trading money and bonds with the monetary authority, money injections implemented through open market operations have a disproportionate impact on the marginal utility of a dollar for these marginal investors that is manifest as a movement in real interest rates. We first illustrate the mechanisms leading to a sluggish response of prices to money and inflation to interest rates in a specification of our model that is analytically tractable. In this specification of our model, households have log utility and all of the income from selling the households’ endowments is deposited directly into the households’ brokerage accounts. 
3. These agents choose not to participate in the open market operation as long as the short-term nominal interest rate remains positive. Note that financial intermediaries also choose not to hold money injected through open market operations as long as the short-term nominal interest rate remains positive.
With these assumptions, the model becomes analytically tractable because households in the model choose to spend their inventories of money in their bank accounts at a rate that is independent of expectations of future prices and monetary policies. We show two main results in this analytical version of our model. First, starting from a steady state in which the opportunity cost of holding money in a bank account is low, in response to a 1% exogenous increase in the money stock, on impact, the price level increases by only 1/2 of 1% because velocity falls by 1/2 of 1%. We show how this result
follows from the basic geometry of money holdings in an inventory-theoretic model of money demand independent of the parameters governing the length of time, in calendar time, between households’ opportunities to transfer cash between their brokerage and bank accounts. Second, also starting from a steady state, in response to a one-percentage-point exogenous change in the nominal interest rate, on impact, the real interest rate responds by one percentage point and expected inflation does not respond at all. We show that this result follows from the asset market segmentation that is inherent in an inventory-theoretic model of money demand, again independent of the parameters governing the length of time, in calendar time, between households’ opportunities to transfer cash between accounts.
The parameters governing the length of time between households’ opportunities to transfer money between accounts are important, however, for our model’s implications for the persistence of price and inflation sluggishness. These parameters also determine our model’s implications for steady-state aggregate velocity: the length of calendar time between households’ opportunities to transfer money determines the size of the inventory of money that households must hold to purchase consumption. Thus, the empirical implications of our model for the sluggishness of prices and inflation are largely determined by how we define money (because that definition determines the measure of velocity and hence the magnitude of households’ cash balances). In our model, defining money comes down to answering the question: What assets correspond to those that households hold in their bank accounts, and what assets do households hold and trade less frequently in their brokerage accounts?
We examine the implications of our model in a quantitative example using a broad measure of money: U.S. households’ holdings of currency, demand deposits, savings deposits, and time deposits.
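The one-half impact response of the price level can be checked with a back-of-the-envelope calculation. The sketch below is not the paper's derivation. It assumes a zero-inflation steady state with no paychecks (γ = 0) and fixed output, in which a household that is s periods past its last transfer spends the fraction 1/(N − s) of its balance, so that spending is spread evenly over the remaining periods, and in which a money injection is absorbed entirely by the currently active households. The function name is hypothetical.

```python
def impact_price_response(n, money_increase=0.01):
    """Impact response of the price level per unit of money growth
    in a stylized economy with N household types.

    Assumptions (illustrative only): zero inflation, gamma = 0, fixed
    output; a type-s household holds (N - s)/N of an active household's
    balance and spends the fraction 1/(N - s) of it each period.
    """
    # Steady-state balances and spending rates for types s = 0..N-1,
    # normalizing an active household's balance to 1.
    balances = [(n - s) / n for s in range(n)]
    velocities = [1.0 / (n - s) for s in range(n)]

    # Each type has mass 1/N.
    money = sum(balances) / n
    spending = sum(v * m for v, m in zip(velocities, balances)) / n

    # The injection is bought only by active households (type 0).
    injection = money_increase * money
    new_balances = balances[:]
    new_balances[0] += injection * n
    new_spending = sum(v * m for v, m in zip(velocities, new_balances)) / n

    # With output fixed, the price level moves with nominal spending.
    price_change = new_spending / spending - 1.0
    return price_change / money_increase
```

Under these assumptions the response works out to (N + 1)/(2N): it equals one when N = 1, the constant-velocity cash-in-advance case, and approaches one-half as N grows, that is, as the model period becomes short relative to the interval between transfers.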
Here we interpret households’ bank accounts in our model as corresponding to U.S. households’ holdings of deposits in retail commercial banks4 in the data and households’ brokerage accounts in the model as corresponding to U.S. holdings of other financial assets outside of the retail commercial banking system in the data. In the data, U.S. households hold a large stock of deposits in retail banks, roughly 1/2 to 2/3 of the annual personal consumption expenditure. We argue for the interpretation of this broad collection of accounts in the data as corresponding to bank accounts in our model because we find in the data that U.S. households pay a large opportunity cost in terms of foregone interest to hold such accounts, on the order of 150–200 basis points. This opportunity cost is not substantially different from the opportunity cost U.S. households pay to hold a narrower definition of money such as M1. To parameterize our model to match the ratio of U.S. households’ holdings of broad money relative to personal consumption expenditure, we assume that households transfer funds between their brokerage and bank accounts very infrequently, on the order of once every one-and-one-half to three years. We argue that this assumption is not inconsistent with evidence summarized by Vissing-Jorgensen (2002) regarding the frequency with which U.S. households trade in high-yield assets. Our interpretation of a bank account used for transactions replenished by transfers from a high-yield managed portfolio of risky and riskless assets is the same as used in the models of Duffie and Sun (1990) and Abel, Eberly, and Panageas (2007).
4. In the data, retail banks correspond to a traditional conception of a commercial bank as an institution funded by consumers’ checking, savings, and small time deposits. Clark et al. (2007) provide a useful description of retail banks in our modern financial system. As they describe, “retail banking is the cluster of products and services that banks provide to consumers and small businesses through branches, the Internet, and other channels.” “Organizationally, many large banking companies have a distinct ‘retail banking’ business unit with its own management and financial reporting structure.” “In terms of products and services, deposit taking is the core retail banking activity on the liability side. Deposit taking includes transactions deposits, such as checking and NOW accounts, and non-transaction deposits, such as savings accounts and time deposits (CDs). Many institutions cite the critical importance of deposits, especially consumer checking account deposits, in generating and maintaining a strong retail franchise. Retail deposits provide a low-cost, stable source of funds and are an important generator of fee income. Checking accounts are also viewed as pivotal because they serve as the anchor tying customers to the bank and allow cross selling opportunities.”
We conduct two quantitative exercises with our model. In the first, we feed into the model the shocks to the stock of M2 and aggregate consumption observed in the U.S. economy in monthly data over the past forty years and examine the model’s predictions for velocity in the short run. The model produces fluctuations in velocity that have a surprisingly high correlation of .60 with the fluctuations in velocity observed in the data. This result stands in sharp contrast to the implications of a standard cash-in-advance model (this model with N = 1). In such a model, aggregate velocity is constant regardless of the pattern of money growth. We also find that the short-run fluctuations in velocity in our model are only 40% as large as those in the data. From the finding that
the short-run fluctuations in velocity in our model are strongly correlated with those observed in the data, we conclude that a substantial portion of the unconditional negative correlation of the ratio of money to consumption and velocity might reasonably be attributed to the response of velocity to exogenous movements in money. From the finding that the short-run fluctuations in velocity in our model are not as large as those in the data, however, we conclude that there may be other shocks to the demand for money that we have not modeled here.
In our second quantitative exercise, we consider the response of money, prices, and velocity to an exogenous shock to monetary policy, modeled as an exogenous, persistent shock to the short-term nominal interest rate similar to that estimated in the literature, which uses vector autoregressions (VARs) to draw inferences about the effects of monetary policy. The consensus in that literature is that the impulse response of inflation to a monetary policy shock is sluggish.5 In our model, we find that the impulse response of inflation is also quite sluggish, as are the responses of money and the price level. All three of these responses from our model are quite similar to the estimated responses of these variables in this VAR literature. Although our model is incomplete in that we have assumed for simplicity that output is exogenous, these findings suggest that our model can account for a substantial portion of the sluggish responses of nominal variables to a change in the nominal interest rate.
Our model is related to a growing literature on segmented asset markets. Grossman and Weiss (1983) and Rotemberg (1984) were the first to point out that open market operations could have effects on real interest rates and a delayed impact on the price level in inventory-theoretic models of money demand. The models they present are similar to this model when the parameter N = 2.
5. See Cochrane (1994) and Leeper, Sims, and Zha (1996) for early estimates, Christiano, Eichenbaum, and Evans (1999) and Uhlig (2005) for an overview, and Christiano, Eichenbaum, and Evans (2005) for recent estimates.
Those authors examine the impact of a surprise money injection in the context of otherwise deterministic models. Here we study a fully stochastic model as in Alvarez and Atkeson (1997). That model is similar to the one presented here in that agents have separate financial accounts in asset and goods markets and cannot transfer funds between these accounts in every period. In that earlier paper, however, at equilibrium, the individual velocity of money is the same for all households and is constant over time,
so that aggregate velocity is constant. This result follows from the assumptions in that paper that households have logarithmic utility and a constant probability of being able to transfer money between the asset market and the goods market. The asset pricing implications of our model are closely related to those obtained by Grossman and Weiss (1983), Rotemberg (1984), and Alvarez and Atkeson (1997). In particular, our model has predictions for the effects of money injections on real interest rates arising from the segmentation of the asset market related to the predictions in those papers and those in Alvarez, Lucas, and Weber (2001) and Alvarez, Atkeson, and Kehoe (2002, 2007). Alvarez, Atkeson, and Kehoe (2002, 2007) study the implications of models with segmented asset markets in which households pay a fixed cost to transfer money between bank and brokerage accounts. In that paper, they focus on equilibria in which all households spend all of the money in their bank account every period so that, again, velocity is constant. Two closely related papers build on our framework by endogenizing segmentation (in the spirit of the original Baumol–Tobin model). Chiu (2007) studies a version of our model where households face a fixed utility cost of transferring resources between bank and brokerage accounts. He solves numerically for the equilibrium response of the model to a once-and-for-all increase in the money supply, starting from steady state.6 He finds that the size of the initial money growth shock plays a key role in determining the response to a shock. When the money growth shock is small relative to the fixed cost, households do not pay the fixed cost and the equilibrium dynamics are the same as in an exogenous segmentation model: a money shock leads to an offsetting fall in aggregate velocity, so that the price level responds sluggishly. 
But for a sufficiently large money injection relative to the fixed cost, all households pay the fixed cost, and so there is no offsetting fall in aggregate velocity and the price level responds one for one to money growth. Because of this, Chiu (2007) concludes that the results from our model are not robust to endogenous segmentation. Khan and Thomas (2007) study a version of our model where households face idiosyncratic fixed costs of transferring resources between the two accounts7 and develop flexible numerical methods for solving the model. They show that the distribution of the idiosyncratic fixed costs plays an important role in determining the equilibrium responses of the model to a money shock. In their benchmark calibrated example, they find that these costs actually reinforce the sluggishness of prices and reinforce the persistence of the liquidity effect relative to our model.
6. Silva (2008) computes the equilibrium response of prices to an interest rate shock in a closely related continuous time model.
7. Alvarez, Atkeson, and Kehoe (2002, 2007) use idiosyncratic fixed costs to endogenously segment asset markets, but they assume households spend all their money each period so that aggregate velocity is constant and equal to one. In Khan and Thomas (2007), as in this paper, not all households spend all their money each period, and so there is a nondegenerate cross-sectional distribution of money holdings.
The paper proceeds as follows. We present the general model. We next present our results on the impact effects of monetary policy on prices and inflation in the analytically tractable specification of our model. We then present our quantitative exercises. In a final section, we discuss how monetary policy might affect output in a version of our model with production and how our results compare with those on price and inflation sluggishness obtained in models with nominal rigidities.
II. AN INVENTORY-THEORETIC MODEL OF MONEY DEMAND
Consider a cash-in-advance economy in which the asset market and the goods market are physically separated. There is a unit mass of households, each composed of a worker and a shopper. Each household has access to two financial intermediaries: one that manages its portfolio of assets and another that manages its money held in a transactions account in the goods market. We refer to the household’s account with the financial intermediary in the asset market as its brokerage account and its account with the financial intermediary in the goods market as its bank account. There is a government that injects money into the asset market via open market operations. Households that participate in the open market operation purchase this money with assets held in their brokerage accounts. These households must transfer this money to their bank accounts before they can spend it on consumption.
Time is discrete and denoted $t = 0, 1, 2, \ldots$. The exogenous shocks in this economy are shocks to the money growth rate $\mu_t$ and shocks to the endowment of each household $y_t$. Because all households receive the same endowment, $y_t$ is also the aggregate endowment of goods in the economy. Let $h_t = (\mu_t, y_t)$ denote the realized shocks in the current period. The history of shocks is
denoted $h^t = (h_0, h_1, \ldots, h_t)$. From the perspective of time zero, the probability distribution over histories $h^t$ has density $f_t(h^t)$. As in a standard cash-in-advance model, each period is divided into two subperiods. In the first subperiod, each household trades assets held in its brokerage account in the asset market. In the second subperiod, the shopper purchases consumption in the goods market using money held in the household’s bank account, while the worker sells the endowment in the goods market for money $P_t(h^t) y_t(h^t)$, where $P_t(h^t)$ denotes the price level in the current period. In the next period, a fraction $\gamma \in [0, 1]$ of the worker’s earnings is deposited in the bank account in the goods market, and the remaining $1 - \gamma$ of these earnings are deposited in a brokerage account in the asset market. We interpret $\gamma$ as the fraction of total income that households receive regularly deposited into their transactions accounts or as currency. We refer to $\gamma$ as the paycheck parameter and to $\gamma P_{t-1}(h^{t-1}) y_{t-1}(h^{t-1})$ as the household’s paycheck. We interpret $1 - \gamma$ as the fraction of total income that households receive in the form of interest and dividends paid on assets held in their brokerage accounts.
Unlike a standard cash-in-advance model, households cannot transfer money between the asset market and the goods market every period. Instead, each household has the opportunity to transfer money between its brokerage account and its bank account only once every $N$ periods. In other periods, a household can trade assets in its brokerage account and use money in its bank account to purchase goods; it simply cannot move money between these two accounts. We refer to households that currently have the opportunity to transfer money between their accounts as active households. Each period a fraction $1/N$ of the households are active. We index each household by the number of periods since it was last active, here denoted by $s = 0, 1, \ldots, N - 1$.
A household of type $s < N - 1$ in the current period will be type $s + 1$ in the next period. A household of type $s = N - 1$ in the current period will be type $s = 0$ in the next period. Hence, a household of type $s = 0$ is active in this period, a household of type $s = 1$ was active last period, and a household of type $s = N - 1$ will be active next period. In period 0, each household has an initial type $s_0$, with fraction $1/N$ of the households of each type $s_0 = 0, 1, \ldots, N - 1$. Let $S(t, s_0)$ denote the type in period $t$ of a household that was initially of type $s_0$. The quantity of money a household $s$ has on hand in its bank account at the beginning of goods market trade is $M_t(s, h^t)$. The
shopper in this household spends some of this money on goods, $P_t(h^t) c_t(s, h^t)$, and the household carries the unspent balance in its bank account into next period, $Z_t(s, h^t)$. For an inactive household of type $s > 0$, the balance in its bank account at the beginning of the period is equal to the quantity of money that it held over in its bank account last period $Z_{t-1}(s - 1, h^{t-1})$ plus its paycheck $\gamma P_{t-1}(h^{t-1}) y_{t-1}(h^{t-1})$. Thus, the evolution of money holdings and consumption for inactive households is
$$M_t(s, h^t) = Z_{t-1}(s - 1, h^{t-1}) + \gamma P_{t-1}(h^{t-1}) y_{t-1}(h^{t-1}), \tag{1}$$
$$M_t(s, h^t) \ge P_t(h^t) c_t(s, h^t) + Z_t(s, h^t). \tag{2}$$
When a household is of type $s = 0$, and hence active, it also chooses a transfer of money $P_t(h^t) x_t(h^t)$ from its brokerage account in the asset market into its bank account in the goods market. Hence, the money holdings and consumption of active households satisfy
$$M_t(0, h^t) = Z_{t-1}(N - 1, h^{t-1}) + \gamma P_{t-1}(h^{t-1}) y_{t-1}(h^{t-1}) + P_t(h^t) x_t(h^t), \tag{3}$$
$$M_t(0, h^t) \ge P_t(h^t) c_t(0, h^t) + Z_t(0, h^t). \tag{4}$$
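The bank-account constraints (1)–(4) can be traced through in a short sketch. The function and variable names below are hypothetical, the inequalities in (2) and (4) are imposed with equality, and the household's spending is taken as given rather than chosen optimally.

```python
def next_type(s, n):
    """Type next period: s + 1 if s < N - 1, and 0 if s = N - 1."""
    return (s + 1) % n


def bank_account_step(s, z_prev, paycheck, spending, transfer=0.0):
    """One period of the bank-account constraints, in nominal terms.

    s         -- periods since the household was last active
    z_prev    -- unspent balance carried over from last period (Z_{t-1})
    paycheck  -- gamma * P_{t-1} * y_{t-1}
    spending  -- P_t * c_t, taken as given here (not optimized)
    transfer  -- P_t * x_t, allowed only for active households (s == 0)

    Returns (M_t, Z_t), with the inequality in (2)/(4) holding as an
    equality: Z_t = M_t - P_t * c_t.
    """
    if s != 0 and transfer != 0.0:
        raise ValueError("only active households (s == 0) can transfer")
    m = z_prev + paycheck + transfer   # equation (1) or (3)
    z = m - spending                   # equation (2) or (4), with equality
    if z < 0:
        raise ValueError("spending exceeds money on hand")
    return m, z
```

Calling `bank_account_step` once per period, with `next_type` advancing the household's type, reproduces the cycle in which balances are topped up by a transfer when s = 0 and run down in between.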
In addition to the bank account constraints, equations (1)–(4) above, the household also faces a sequence of brokerage account constraints. In each period, the household can trade a complete set of one-period state-contingent bonds that pay one dollar into the household’s brokerage account next period if the relevant contingency is realized. Let $B_{t-1}(s-1,h^t)$ denote the stock of bonds held by households of type $s$ at the beginning of period $t$ following history $h^t$, and let $B_t(s,h^t,h')$ denote bonds purchased at price $q_t(h^t,h')$ that will pay off next period if $h'$ is realized. Let $A_t(s,h^t) \ge 0$ denote money held by the household in its brokerage account at the end of the period. Because an inactive household of type $s > 0$ cannot transfer money between its brokerage account and its bank account, this household’s bond and money holdings in its brokerage account must satisfy

$$B_{t-1}(s-1,h^t) + A_{t-1}(s-1,h^{t-1}) + (1-\gamma)P_{t-1}(h^{t-1})y_{t-1}(h^{t-1}) - P_t(h^t)\tau_t(h^t) \ge \int q_t(h^t,h')B_t(s,h^t,h')\,dh' + A_t(s,h^t), \tag{5}$$

where $\tau_t(h^t)$ denotes real lump-sum taxes. Each household’s real bond holdings must remain within arbitrarily large bounds. The
analogous constraint for active households is

$$B_{t-1}(N-1,h^t) + A_{t-1}(N-1,h^{t-1}) + (1-\gamma)P_{t-1}(h^{t-1})y_{t-1}(h^{t-1}) - P_t(h^t)\tau_t(h^t) \ge \int q_t(h^t,h')B_t(0,h^t,h')\,dh' + P_t(h^t)x_t(h^t) + A_t(0,h^t), \tag{6}$$
where $P_t(h^t)x_t(h^t)$ is the active household’s transfer of money from brokerage to bank account.

At the beginning of period 0, initially inactive households begin with exogenous balances $\bar M_0(s_0)$ in their bank accounts in the goods market. This quantity is the balance on the left-hand side of (2) in period 0. For initially active households, the initial balance $\bar M_0(0,h^0)$ in (4) is composed of an exogenous initial balance $\bar Z_0$ and a transfer $P_0(h^0)x_0(h^0)$ of their choosing. Each household also begins with exogenous balance $\bar B_{-1}(s_0)$ in its brokerage account on the left-hand side of constraints (5) and (6). The households initially have no money corresponding to $\bar A_{-1}(s_0)$ in their brokerage accounts.

For each date and state, and taking as given the prices and aggregate variables, each household of initial type $s_0$ chooses complete contingent plans for transfers, consumption, bond, and money holdings to maximize expected utility,

$$\sum_{t=0}^{\infty} \beta^t \int u[c_t(s,h^t)]\, f_t(h^t)\, dh^t, \qquad s = S(t,s_0),$$

subject to the constraints (1), (2), and (5) in those periods $t$ in which $S(t,s_0) > 0$ and constraints (3), (4), and (6) in those periods $t$ in which $S(t,s_0) = 0$.

Let $B_t(h^t)$ be the total stock of government bonds. The government faces a sequence of budget constraints,

$$B_{t-1}(h^t) = M_t(h^t) - M_{t-1}(h^{t-1}) + P_t(h^t)\tau_t(h^t) + \int q_t(h^t,h')B_t(h^t,h')\,dh',$$

together with arbitrarily large bounds on its real bond issuance. We denote the government’s policy for money injections as $\mu_t(h^t) = M_t(h^t)/M_{t-1}(h^{t-1})$. In period 0, the initial stock of government debt is $\bar B_{-1}$ and $M_0(h^0) - \bar M_{-1}$ is the initial monetary injection. This budget constraint implies that the government pays off its initial
debt with a combination of lump-sum taxes and money injections achieved through open market operations.

An equilibrium of this economy is a collection of prices, complete contingent plans for households, and government policy such that (i) taking as given prices and government policy, the complete contingent plans solve each household’s problem, and (ii) the goods market clears, $\frac{1}{N}\sum_{s=0}^{N-1} c_t(s,h^t) = y_t(h^t)$, the money market clears, $\frac{1}{N}\sum_{s=0}^{N-1} [M_t(s,h^t) + A_t(s,h^t)] = M_t(h^t)$, and the bond market clears, $\frac{1}{N}\sum_{s=0}^{N-1} B_t(s,h^t,h') = B_t(h^t,h')$, for each date and state.

To understand equilibrium money demand and asset prices, we examine the household’s first-order conditions. Let $\eta_t(s,h^t)$ denote Lagrange multipliers on the bank account constraints (2) and (4) of household $s$, and let $\lambda_t(s,h^t)$ denote Lagrange multipliers on the brokerage account constraints (5) and (6). Active households choose transfers $x_t(h^t)$ to equate the multipliers on the bank and brokerage accounts:

$$\eta_t(0,h^t) = \lambda_t(0,h^t). \tag{7}$$

For households of type $s$ the marginal utility of a dollar satisfies

$$\eta_t(s,h^t) = \beta^t \frac{u'[c_t(s,h^t)]}{P_t(h^t)} f_t(h^t). \tag{8}$$

The multipliers on the bank accounts satisfy the inequalities

$$\eta_t(s,h^t) \ge \int \eta_{t+1}(s+1,h^t,h')\,dh', \tag{9}$$

which hold with equality if $Z_t(s,h^t) > 0$. Combining (8) and (9), we have the consumption Euler equations that determine a household’s money demand,

$$1 \ge \beta \int \frac{u'[c_{t+1}(s+1,h^t,h')]}{u'[c_t(s,h^t)]} \frac{P_t(h^t)}{P_{t+1}(h^t,h')} \frac{f_{t+1}(h^t,h')}{f_t(h^t)}\,dh', \tag{10}$$

which again holds with equality if $Z_t(s,h^t) > 0$. The evolution of the marginal utility of a dollar in the brokerage account is determined by state-contingent bond prices:

$$q_t(h^t,h') = \frac{\lambda_{t+1}(s+1,h^t,h')}{\lambda_t(s,h^t)}. \tag{11}$$

Under the assumption that initial conditions are such that the initial Lagrange multipliers on the brokerage account $\lambda_0(s_0)$ are
the same for all households (see footnote 8), equations (7), (8), and (11) together imply that state-contingent bond prices are then given by

$$q_t(h^t,h') = \beta\, \frac{u'[c_{t+1}(0,h^t,h')]}{u'[c_t(0,h^t)]} \frac{P_t(h^t)}{P_{t+1}(h^t,h')} \frac{f_{t+1}(h^t,h')}{f_t(h^t)}. \tag{12}$$
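The step from (7), (8), and (11) to (12) can be spelled out as follows. This is only a sketch of the argument, using the equations above together with the stated assumption of equal initial multipliers; the intermediate lines are our own rearrangement.

```latex
% (11) holds for every type s, so the multiplier lambda grows at a common rate
% for all households. With equal initial multipliers, it is therefore common
% across types at every date and history:
\lambda_t(s,h^t) = \lambda_t(0,h^t) \quad \text{for all } s.
% By (7) and then (8), evaluated for the active household (s = 0):
\lambda_t(0,h^t) = \eta_t(0,h^t) = \beta^t\,\frac{u'[c_t(0,h^t)]}{P_t(h^t)}\,f_t(h^t).
% Substituting this expression at dates t and t+1 into (11):
q_t(h^t,h') = \frac{\lambda_{t+1}(0,h^t,h')}{\lambda_t(0,h^t)}
            = \beta\,\frac{u'[c_{t+1}(0,h^t,h')]}{u'[c_t(0,h^t)]}\,
              \frac{P_t(h^t)}{P_{t+1}(h^t,h')}\,
              \frac{f_{t+1}(h^t,h')}{f_t(h^t)}.
```

The key point is that bond prices are pinned down by the marginal utility of whichever households are active at each date, not by the marginal utility of any one household followed over time.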
The nominal interest rate is then found from the price of a noncontingent bond paying interest $i_t(h^t)$ in nominal terms:

$$\frac{1}{1+i_t(h^t)} = \int q_t(h^t,h')\,dh' = \beta \int \frac{u'[c_{t+1}(0,h^t,h')]}{u'[c_t(0,h^t)]} \frac{P_t(h^t)}{P_{t+1}(h^t,h')} \frac{f_{t+1}(h^t,h')}{f_t(h^t)}\,dh'. \tag{13}$$

In what follows, we will characterize equilibrium in an analytically tractable specification of our model using methods similar to those used in a Lucas-tree economy (see Lucas [1978]). That is, we will find the allocations of money and consumption across households implied by market clearing and then solve for asset prices in terms of marginal utilities using the first-order conditions that link bond prices to ratios of marginal utilities above.

To gain intuition as to how these prices lead households to choose to purchase more or less money in an open market operation, as required in equilibrium to match the central bank’s policy for money injections, we find it useful to recast these first-order conditions in terms of the date-zero asset prices implied by our state-contingent bond prices. Specifically, let $Q_t(h^t)$ denote the price in period 0 of one dollar delivered in the asset market in period $t$ following history $h^t$. These prices satisfy the recursion $Q_t(h^t) = Q_{t-1}(h^{t-1})\,q_{t-1}(h^{t-1},h_t)$ for $t \ge 1$. From (11) and the recursion for date-zero prices we then have that for all households,

$$Q_t(h^t) = \lambda_t(s,h^t). \tag{14}$$
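The recursion for date-zero prices is a running product of one-period bond prices along a realized history. A minimal Python sketch follows; the function name, the normalization $Q_0 = 1$, and the sample prices are assumptions made for illustration only.

```python
def date_zero_prices(q_path, q0=1.0):
    """Return [Q_0, Q_1, ..., Q_T] along one realized history.

    q_path[t] is the period-t price of a dollar delivered in period t+1
    along the realized history, so that Q_{t+1} = Q_t * q_path[t].
    The starting value q0 = 1 is a normalization assumed here.
    """
    prices = [q0]
    for q in q_path:
        prices.append(prices[-1] * q)
    return prices


# Hypothetical one-period bond prices along a single history.
Q = date_zero_prices([0.99, 0.98, 0.97])
```

A lower one-period price at any date lowers every later date-zero price along that history, which is how a single large injection shows up in the whole subsequent price path.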
Again, using the assumption that initial conditions are such that the initial Lagrange multipliers on the brokerage account $\lambda_0(s_0)$ are the same for all households, from (7) and (8), we have that asset prices are determined by the marginal utility for active households:

$$Q_t(h^t) = \beta^t \frac{u'[c_t(0,h^t)]}{P_t(h^t)} f_t(h^t). \tag{15}$$

8. This can be ensured by an appropriate choice of initial bond holdings $\bar B_0(s_0)$ or with the assumption that households trade securities contingent on their initial type $s_0$ in an initial asset market before they learn this type.
A large money injection at $t$ and $h^t$ is associated with a low date-zero price $Q_t(h^t)$ and large purchases of money by those households that are currently active (obtained by selling bonds). These active households then transfer this money immediately to their bank accounts and begin spending it, so the low date-zero price $Q_t(h^t)$ is associated with high consumption $c_t(0,h^t)$ for households that happen to be active at this date. Likewise, a small money injection at $t$ and $h^t$ is associated with a high date-zero price $Q_t(h^t)$ and small purchases of money and low consumption by those households that are currently active.

The mechanism through which money injections in this model have an impact on “real” asset prices is also most easily understood in terms of these date-zero asset prices. We can define a real asset price as the price at date zero of a claim to sufficient cash to purchase one unit of consumption at date $t$ following history $h^t$. This price is given by $Q_t(h^t)P_t(h^t)$. Note from (15) that this asset price is equal to the discounted, probability-weighted marginal utility of consumption of the households that are active at date $t$. In a standard cash-in-advance model, all households are active at each date and consumption is exogenous, so this real asset price is invariant to the specification of monetary policy. As we show below, in our model, money injections redistribute cash holdings across households and thus impact the consumption of the subset of agents who are active at a given date. Corresponding to this redistributive effect, money injections in our model also impact real asset prices in equilibrium.

To this point, we have made explicit reference to uncertainty in the notation in order to give a clear characterization of state-contingent asset prices. For the remainder of the paper we suppress reference to histories $h^t$ to simplify notation.
The inequalities governing money demand can therefore be written

$$1 \ge E_t\left[\beta\, \frac{u'[c_{t+1}(s+1)]}{u'[c_t(s)]} \frac{P_t}{P_{t+1}}\right], \tag{16}$$

with equality if $Z_t(s) > 0$, whereas the price for bonds can be written

$$\frac{1}{1+i_t} = E_t\left[\beta\, \frac{u'[c_{t+1}(0)]}{u'[c_t(0)]} \frac{P_t}{P_{t+1}}\right]. \tag{17}$$
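To see how (17) ties the nominal interest rate to the consumption of active households, consider a small Python sketch. It is not the general formula: it assumes a deterministic path (so the expectation drops out) and log utility, $u'(c) = 1/c$, and the function name and numbers are hypothetical.

```python
def nominal_rate(beta, c_now, c_next, p_now, p_next):
    """Nominal rate i_t implied by (17), assuming log utility and no uncertainty.

    With u'(c) = 1/c, (17) becomes
        1 / (1 + i_t) = beta * (c_now / c_next) * (p_now / p_next),
    where c_now and c_next are the consumption of the households that are
    active in periods t and t+1.
    """
    bond_price = beta * (c_now / c_next) * (p_now / p_next)
    return 1.0 / bond_price - 1.0


# Hypothetical values: constant consumption of active households.
i_flat = nominal_rate(0.96, 1.0, 1.0, 1.0, 1.0)    # no inflation
i_infl = nominal_rate(0.96, 1.0, 1.0, 1.0, 1.02)   # 2% inflation
i_growth = nominal_rate(0.96, 1.0, 1.05, 1.0, 1.0) # active consumption rises 5%
```

In this sketch the nominal rate rises either with expected inflation or with the growth in consumption of active households, and the second channel is the one through which money injections move the real rate.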
III. HOW THE MODEL WORKS

In this section, we solve our model for a special case that is analytically tractable to demonstrate how the model works. In this special case, agents have utility $u(c) = \log(c)$ and the paycheck parameter is $\gamma = 0$. Given these assumptions, households of type $s$ spend a constant fraction $v(s)$ of their current money holdings and carry the remaining fraction $1 - v(s)$ into the next period, irrespective of the future paths of money and prices. Because agents choose this simple pattern of expenditure, we can, in this special case, solve analytically for the dynamic, stochastic equilibrium of our model. We use this analytical example to show first how the price level responds sluggishly to an exogenous change in money growth and then how inflation responds sluggishly to an exogenous change in the nominal interest rate. In the next section, we explore the quantitative implications of our model for illustrative examples in which household expenditure does vary with the future paths of money and prices because agents have preferences other than log utility and/or the paycheck parameter is positive.

In presenting this version of the model, we allow the length of a time period to be an arbitrary $\Delta > 0$ units of calendar time (measured in fractions of a year). We continue to use $t$ to count time periods, so after $t$ periods, $t\Delta$ units of calendar time have passed. We refer to flow variables such as consumption at annual rates, so that $c_t\Delta$ is consumption in period $t$. Likewise, the discount factor for the flow utility is $\beta^{\Delta}$, where $\beta$ reflects discounting in preferences at an annual rate. We let $T > 0$ denote the calendar length of time between activity for households, so that $N = T/\Delta$ is the number of periods that elapse between activity. We first derive results for an arbitrary length of a period, $\Delta$, and then focus attention on particularly simple formulas that obtain when we let $\Delta \to 0$ for fixed $T$ (so that $N$ approaches infinity).
We focus on the case of an arbitrarily small time period to show that the time period in our model does not have any economic significance and because this helps simplify the resulting formulas. For purposes of exposition, we leave all the algebraic details to the Appendix. In our analysis here, we assume that, at equilibrium, nominal interest rates are positive, so that households choose not to hold money in their brokerage accounts, where money is dominated in rate of return by bonds, and that the opportunity cost of holding money in a bank account is high, so that those households that
are about to transfer money between their brokerage and bank accounts do not hold money in their bank accounts. These conditions are analogous to the cash-in-advance constraint binding in a standard cash-in-advance model (this model with $N = 1$). After solving the model under these assumptions, one can use equations (16) and (17) to check the first-order conditions governing these two assumptions regarding money holdings.

III.A. Money and Velocity

In our model, households periodically withdraw money from the asset market and then spend that money slowly in the goods market to ensure that it lasts until they have another opportunity to withdraw money from the asset market. As a result, households’ equilibrium paths for money holdings have the familiar saw-toothed shape characteristic of inventory-theoretic models of money demand. Here we discuss how this saw-toothed pattern of money holdings shapes our model’s implications for the dynamics of money, velocity, and prices.

Given our assumption that households have utility $u(c) = \log(c)$ and the paycheck parameter is $\gamma = 0$, households’ money holdings and nominal spending at period $t$ for a period of length $\Delta$ are given by

$$M_{t+1}(s+1) = (1 - v(s)\Delta)\,M_t(s) \quad\text{and}\quad P_t c_t(s) = v(s)\,M_t(s) \tag{18}$$

with

$$v(s) \equiv \frac{1}{\Delta}\, \frac{1 - \beta^{\Delta}}{1 - \beta^{(N-s)\Delta}}. \tag{19}$$
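The individual velocity in (19) is easy to evaluate numerically. The Python sketch below is a minimal illustration; the function name and the parameter values (monthly periods, activity once a year, an annual discount factor of 0.97) are hypothetical.

```python
def individual_velocity(s, N, beta, delta):
    """Annualized individual velocity v(s) from (19).

    s     : periods since the household was last active (0 = active now)
    N     : number of periods between visits to the asset market
    beta  : annual discount factor
    delta : period length in years
    """
    return (1.0 - beta ** delta) / (1.0 - beta ** ((N - s) * delta)) / delta


# Hypothetical calibration: monthly periods, activity once a year.
N, delta = 12, 1.0 / 12.0
velocities = [individual_velocity(s, N, beta=0.97, delta=delta) for s in range(N)]

# Per-period spending share of a household about to become active.
last_share = velocities[-1] * delta

# With beta very close to one, v(0) * delta approaches 1 / N.
near_limit = individual_velocity(0, N, beta=0.999999, delta=delta) * delta
```

Two features are worth checking in the output: velocities rise with $s$, and a household of type $N - 1$ spends its entire remaining balance in the period, consistent with the assumption that households about to become active carry no money forward.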
We refer to $v(s)$ as the individual velocity of money at an annual rate and to $v(s)\Delta$ as the individual velocity in period $t$. Note that, in this special case of our model, these individual velocities of money are constant over time regardless of expectations for the future path of money and prices. Observe that these individual velocities $v(s)$ converge to $1/[(N-s)\Delta]$ as $\beta$ approaches one. In this limiting case, the nominal expenditure of each household is constant over time, as is assumed in the original Baumol–Tobin framework.

Given that individual velocities $v(s)$ are constant in this specification of our model, aggregate velocity for any date or state is simply a function of the distribution of money across households with different individual velocities. If the nominal interest rate is
positive, so that households do not hold any money in the asset market, money market clearing implies

$$M_t = \frac{1}{N}\sum_{s=0}^{N-1} M_t(s). \tag{20}$$

Accordingly, we interpret $\{M_t(s)/M_t\}_{s=0}^{N-1}$ as the distribution of money holdings across households. Goods market clearing then implies that the aggregate velocity of money is a weighted average of the individual velocities of money, where the weights are given by the distribution of money holdings across households,

$$v_t \equiv \frac{P_t y_t}{M_t} = \frac{1}{N}\sum_{s=0}^{N-1} \frac{P_t c_t(s)}{M_t} = \frac{1}{N}\sum_{s=0}^{N-1} v(s)\,\frac{M_t(s)}{M_t}, \tag{21}$$

where $v_t$ is aggregate velocity at an annual rate.

In a steady state with constant money growth, the distribution of money holdings across households of different types is constant. Hence, aggregate velocity is also constant, and the steady-state inflation rate is equal to the money growth rate. Therefore, our model predicts that in the long run, along a steady-state growth path, the price level and the money supply grow together, whereas the aggregate velocity of money stays constant.

Out of steady state, however, because the individual velocities of money $v(s)$ vary across households with different values of $s$, fluctuations in aggregate money growth cause fluctuations in the distribution of money across households, and this in turn causes fluctuations in aggregate velocity. More specifically, the dynamics of prices, velocity, and money are determined by two factors: first, the differences in individual velocities $v(s)$ across households of different types, and second, the effect of a money injection on the distribution of money holdings across households.

How these factors affect fluctuations in aggregate velocity can be understood intuitively as follows. First, consider the differences in individual velocities $v(s)$. These measures of individual velocity equal the flow of consumption obtained by each household relative to its money holdings at the beginning of the period. From (19), we immediately see that $v(s)$ is increasing in $s$. A household of type $s$ close to zero holds a large stock of money relative to its consumption, whereas
a household of type $s$ close to $N - 1$ holds only a small stock of money relative to its consumption.

Next consider how a money injection affects the distribution of money across households. From (18), the evolution of the distribution of money for households of type $s = 1, \ldots, N - 1$ is given by

$$\frac{M_t(s)}{M_t} = (1 - v(s-1)\Delta)\, \frac{M_{t-1}(s-1)}{M_{t-1}}\, \frac{1}{\mu_t^{\Delta}}, \tag{22}$$

using $\mu_t = (M_t/M_{t-1})^{1/\Delta}$ to denote money growth at an annual rate. Because the money shares must average to one across types, as in (20), the money holdings of active households are

$$\frac{1}{N}\frac{M_t(0)}{M_t} = 1 - \frac{1}{N}\sum_{s=1}^{N-1} (1 - v(s-1)\Delta)\, \frac{M_{t-1}(s-1)}{M_{t-1}}\, \frac{1}{\mu_t^{\Delta}}. \tag{23}$$
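The law of motion in (22) and (23) amounts to a one-period update of the vector of money shares. The Python sketch below illustrates that update under stated assumptions: the function and variable names are ours, and the example uses the limiting case in which each household's per-period spending share is $1/(N - s)$, with a hypothetical $N = 4$.

```python
def step_shares(shares, spend_frac, gross_growth):
    """One-period update of money shares m(s) = M(s)/M using (22) and (23).

    shares[s]     : last period's share for type s = 0..N-1 (averaging to one)
    spend_frac[s] : per-period individual velocity v(s) * Delta
    gross_growth  : per-period gross money growth, M_t / M_{t-1}
    """
    N = len(shares)
    new = [0.0] * N
    for s in range(1, N):
        # (22): type s-1 households carry their unspent money into type s.
        new[s] = (1.0 - spend_frac[s - 1]) * shares[s - 1] / gross_growth
    # (23): active households hold whatever makes the shares average to one.
    new[0] = N - sum(new[1:])
    return new


# Hypothetical example with N = 4 in the limiting case of constant expenditure.
N = 4
spend_frac = [1.0 / (N - s) for s in range(N)]
steady = [2.0 * (N - s) / (N + 1) for s in range(N)]   # shares proportional to N - s
unchanged = step_shares(steady, spend_frac, gross_growth=1.0)
shocked = step_shares(steady, spend_frac, gross_growth=1.01)
```

With no money growth the steady-state shares reproduce themselves, while a positive injection shifts money toward active households and away from every other type.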
Given an initial distribution of money holdings across households and a process for money growth μt , equations (22) and (23) completely characterize the equilibrium dynamics of the distribution of money holdings across households and hence the equilibrium dynamics of aggregate velocity and the price level. This law of motion for the distribution of money has two key implications. First, in response to an increase in the money supply, aggregate velocity falls and thus the price level responds less than one for one with the money supply. Hence, prices in this model are sluggish in that they move less than would be predicted by the simplest quantity theory. Specifically, the proportional response of prices on impact is roughly half as large as the proportional change in the supply of money. Second, there is a persistently sluggish response of prices to changes in the quantity of money, and the extent of persistence is increasing in the calendar length of time between periods of activity. To see these implications, consider first the impact of a money injection on velocity. By redistributing money toward the active households, an increase in the supply of money tilts the distribution of money holdings toward agents with low individual velocities and away from agents with high individual velocities, lowering aggregate velocity. To see this result more formally, we proceed in two steps. In the first step, we derive the elasticity of velocity with respect to money growth for an arbitrary period length and show that the elasticity is negative—so that on impact,
velocity declines when money growth increases. In the second step, we consider the case of an arbitrarily small period length.

To derive the elasticity of velocity with respect to money growth in period $t$ analytically from equations (21), (22), and (23), observe that

$$\frac{\partial (v_t \mu_t^{\Delta})}{\partial \mu_t^{\Delta}} = v(0). \tag{24}$$

The elasticity of velocity with respect to money growth in period $t$ is thus given by

$$\frac{\partial \log(v_t)}{\partial \log \mu_t^{\Delta}} = \frac{1}{v_t}\left[\frac{\partial (v_t \mu_t^{\Delta})}{\partial \mu_t^{\Delta}} - v_t\right] = \frac{v(0) - v_t}{v_t}. \tag{25}$$

Because the individual velocity of active households is less than the aggregate velocity ($v(0) < v_t$), aggregate velocity declines when money growth increases. Given the exchange equation $M_t v_t = P_t y_t$, we see that the price level does not respond on impact one for one with an increase in the money supply, because that increase in the money supply leads to an endogenous decrease in aggregate velocity.

To quantify this elasticity, we evaluate velocity at steady state, $v_t = \bar v$. To simplify the formulas, we suppose the steady-state money growth rate is $\bar\mu = 1$ and the time discount factor $\beta \to 1$, so that the steady-state real return to holding money, $\beta/\bar\mu$, also goes to one. In this limiting case, the expenditure of each household is constant over time, as in the original Baumol–Tobin framework. In this limit, the individual velocity of active households per period is $v(0)\Delta = \Delta/T$ and steady-state aggregate velocity per period is $\bar v\Delta = 2/(T/\Delta + 1)$, so that, under these assumptions, the elasticities of aggregate velocity and of inflation with respect to period money growth are

$$\frac{\partial \log(v)}{\partial \log(\mu^{\Delta})} = -\frac{1}{2}\,\frac{T/\Delta - 1}{T/\Delta} \quad\text{and}\quad \frac{\partial \log(\pi^{\Delta})}{\partial \log(\mu^{\Delta})} = \frac{1}{2}\,\frac{T/\Delta + 1}{T/\Delta}, \tag{26}$$

where these derivatives are evaluated at steady state and where $\pi$ denotes the inflation rate. We can see here that if $T = \Delta$, so that $N = 1$, as in a standard cash-in-advance model, inflation responds one for one with the shock to money growth and velocity is constant. In contrast, if for fixed $T$ we take $\Delta \to 0$, then inflation responds only half as much as money growth. This result follows from the geometry of money
holdings implied by an inventory-theoretic model: a household that has just replenished its bank account will hold roughly twice as much money as an average household and hence have roughly half the velocity of the average household.

Note that here, as we consider the limit as the time period shrinks to zero, we also shrink the magnitude of the money injection to zero. To be able to properly interpret the impact effect, we now specify our model with a small yet finite value of $\Delta$ and consider the effect of a sequence of money injections carried out gradually, one per model period, that cumulate over time to a sizable injection. To be specific, we set $\Delta$ to correspond to a day and calculate the effects of a total increase in the money supply of 1% accomplished via a sequence of equal-sized money injections, one per model period, over the course of one month, that is, a money injection that increases the money supply by 1/30th of 1% (0.0333%) each day for thirty days. Our analytical results characterize the response of velocity and prices to the money injection on the first day, because we start the model off from a steady state. After the first day, however, the distribution of money holdings across households is no longer in steady state and we must track the impact of the remaining money injections numerically.

Figure II illustrates the dynamics of money, velocity, and prices following this shock. In response to this money injection, aggregate velocity falls and the price level responds less than one for one with the change in the money supply. As we showed analytically, the elasticity of velocity with respect to money growth near steady state is approximately −1/2. The impact of the first day’s money injection on velocity is −0.0166%, very close to the analytical value of −0.5 × 0.0333% to be expected.
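The first-day impact can be checked with a short Python sketch that applies one period of (22) and (23) to the steady state and compares the resulting change in velocity with the elasticity in (26). The sketch assumes the limiting case $\beta/\bar\mu \to 1$, and the value $N = 360$ (activity once every 360 one-day periods) is a hypothetical choice for illustration, not the calibration behind Figure II; all function names are ours.

```python
def steady_state(N):
    """Steady-state money shares and per-period velocities when beta/mu -> 1."""
    spend = [1.0 / (N - s) for s in range(N)]              # v(s) * Delta
    shares = [2.0 * (N - s) / (N + 1) for s in range(N)]   # M(s) / M
    return shares, spend


def velocity_after_injection(N, gross_growth):
    """Per-period aggregate velocity one period after a steady state, via (21)-(23)."""
    shares, spend = steady_state(N)
    new = [0.0] * N
    for s in range(1, N):
        new[s] = (1.0 - spend[s - 1]) * shares[s - 1] / gross_growth
    new[0] = N - sum(new[1:])
    return sum(v * m for v, m in zip(spend, new)) / N


N = 360                       # assumed number of daily periods between visits
shock = 1.0 + 0.01 / 30       # first day's injection: 1/30th of 1%
v_bar = 2.0 / (N + 1)         # steady-state velocity per period
pct_change = velocity_after_injection(N, shock) / v_bar - 1.0
predicted = -0.5 * (N - 1) / N * (shock - 1.0)   # elasticity in (26) times the shock
```

Under these assumptions the simulated first-day change in velocity and the linear prediction from (26) agree to well within the size of the shock, and both are close to minus one-half of the day's injection.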
Tracking the effects of the remaining 29 money injections gives the cumulative effect of this sequence of money injections at time $t = 30$ days on velocity of −0.48%, approximately −1/2 of the cumulative shock of 1.00% that was introduced over those thirty days. In the figure, we trace out the dynamics of money and prices for a total of 300 days (or ten months). Over time, aggregate velocity and prices rise, even overshooting their steady-state levels, and then gradually converge to steady state with dampened oscillations.

The results displayed in Figure II regarding the impact of a 1% increase in the money stock carried out over one month are very similar to the results that we obtain when we simply set the length of the model period to correspond to one month and calculate the effect of a 1% increase in the money supply accomplished in a single model period (the corresponding figure is available upon request).

FIGURE II. Money Up, Velocity Down, Prices Sluggish
The dynamics of money, velocity, and prices following a money growth shock. In this exercise, the money growth rate increases by 1/30th of 1% for 30 days. In response to this money injection, aggregate velocity falls and the price level responds less than one for one with the change in the money supply. We showed analytically that the elasticity of velocity with respect to money growth near steady state is approximately −1/2. We find that after thirty days (one month), the cumulative effect of this sequence of money injections on velocity is −0.48%, approximately −1/2 of the cumulative shock of 1% that was introduced over those thirty days. Our analytical results characterize the response of velocity and prices to the money injection on the first day, because we start the model off from a steady state. After the first day, however, the distribution of money holdings across households is no longer in steady state and we must track the impact of the remaining money injections numerically.

The dynamics of velocity following a shock can be understood as follows. Because the money growth rate is high for only one month, from (22) we see that the households that were active at the time of the money injection carry an abnormally large stock of money until they next have the opportunity to transfer funds
from their brokerage accounts. As shown in (19), their individual velocities rise each period until this next visit occurs. Thus, aggregate velocity remains below its steady-state level for a time initially, as these agents have a low individual velocity, and then rises past its steady-state level, as the individual velocity for these agents rises. After $N$ months these agents have spent all of their money and they visit the asset market again. If this were the only effect, we would expect aggregate velocity to return to its steady-state value in $N/2$ months. However, we show in the Appendix that aggregate velocity remains below its steady-state value for approximately $N\log(2)$ months, well over $N/2$ months (because $\log(2) \approx 0.69$). In this sense, there is persistence in the sluggish response of prices to changes in the quantity of money and this persistence is increasing in $N$. The periodic structure of the model introduces a sequence of dampened oscillations in velocity as the changes in the distribution of money holdings work their way through the system. After the first $N$ months, however, these effects are quite small.

III.B. Interest Rates and Inflation

Until now, we have taken as given the path of money growth and examined our model’s implications for the responses of velocity and the price level to a shock to money growth. An alternative approach is to discuss monetary policy in terms of interest rates and solve endogenously for the responses of money growth, velocity, and inflation consistent with a shock to nominal interest rates. We turn now to such an analysis. Here we show our main result that, on impact, inflation responds sluggishly to a shock to interest rates.

We demonstrate analytically that the response of inflation to a change in the nominal interest rate is sluggish in our model when $N$ is large, again under the assumptions that $u(c) = \log(c)$ and $\gamma = 0$ so that individual velocities $v(s)$ are time-invariant.
We solve for the responses of money growth, velocity, and inflation to a change in the nominal interest rate in a deterministic setting. Specifically, we assume the nominal interest rate, inflation, money growth, and the distribution of money holdings across households (and hence velocity) are all initially at steady-state values corresponding to a constant interest rate $\bar\imath$. We fix at $t = 0$ an increase in the nominal rate above steady state, $i_0 > \bar\imath$. We solve for the response of inflation, money growth, and velocity consistent with this change in the nominal interest rate.
To solve for these responses, we use the pricing formula for nominal bonds (17). In a deterministic setting, this formula can be rewritten as a Fisher equation relating nominal interest rates, real interest rates, and inflation between the current period and the next,

$$\hat\imath_t = \hat r_t + \hat\pi_{t+1}, \tag{27}$$

where a circumflex denotes log deviation from steady state and where we repeatedly use approximations of the form $\log(1 + i_t) \approx i_t$. We use this Fisher equation to find a path for money growth such that the implied paths for inflation and the real interest rate are consistent with the exogenously specified path for the nominal interest rate. Recall that, in our model, changes in the path of money growth have an impact on velocity, inflation, and real interest rates, with the magnitude of these changes depending on $N$.

As a benchmark, consider first the responses of money growth, velocity, and inflation when $N = 1$ (so that our model is a standard constant-velocity cash-in-advance model). With $N = 1$, all households are active, velocity is constant, and the consumption of active households is also constant at $c_t(0) = y$. As a result, in this case, inflation is equal to money growth ($\hat\pi_{t+1} = \hat\mu_{t+1}$) and the real interest rate is constant ($\hat r_t = 0$). With these results, we see that any path of money growth that is consistent with our exogenously specified path of nominal interest rates must have money growth $\hat\mu_1$ and inflation $\hat\pi_1$ responding one for one to the change in the nominal interest rate in period 0. That is, $\hat\mu_1 = \hat\imath_0$. Clearly, in this case, the response of inflation from period $t = 0$ to $t = 1$ anticipated in period $t = 0$ in response to the change in the nominal interest rate $\hat\imath_0$ is not at all sluggish.

Our solution of the model in this benchmark case with $N = 1$ is not yet complete, as we have not solved for the equilibrium responses of money growth $\hat\mu_0$ and inflation $\hat\pi_0$ on impact, at date $t = 0$. It is well known that in this textbook cash-in-advance model ($N = 1$), this initial money growth rate and inflation rate are not determinate under an exogenous interest rate rule.
We resolve the indeterminacy by choosing the particular path of money growth $\hat\mu_0$ so that, on impact, inflation from the last period to the current period does not respond to the change in the nominal interest rate in the current period (i.e., so that $\hat\pi_0 = 0$). In the model with $N = 1$, this is achieved by setting $\hat\mu_0 = 0$. This resolution of the
indeterminacy is equivalent to assuming that the price level in period $t = 0$ does not respond to the change in the nominal interest rate and hence is consistent with the schemes used to identify shocks to monetary policy discussed in Christiano, Eichenbaum, and Evans (1999). Note that this resolution of the indeterminacy fixes the responses of money growth and inflation at date $t = 0$ by assumption. Of interest are the equilibrium values of money growth and inflation at date $t = 1$, $\hat\mu_1$ and $\hat\pi_1$.

We now turn to the case of a general $N > 1$. At the end of this section, we show that this indeterminacy of the initial money growth rate $\hat\mu_0$ given the exogenous path of the nominal interest rate extends to our setting with $N > 1$. In particular, we show that, as in the case with $N = 1$, there is a continuum of paths of money growth consistent with a given path of nominal interest rates. As in the case with $N = 1$, with $N > 1$, this continuum has only one dimension; that is, these paths can be indexed by their initial money growth rates $\hat\mu_0$ despite the fact that this model has a nondegenerate distribution of money holdings across households as a state variable that is absent from the model with $N = 1$. Here, we again resolve this indeterminacy by examining the path of money growth consistent with $\hat\pi_0 = 0$. Given our assumption of log utility and $\gamma = 0$, so that individual velocities are constant over time, this path of money growth has initial money growth at its steady-state level, $\hat\mu_0 = 0$.

Given this result that $\hat\mu_0 = 0$ under our resolution of the indeterminacy under an interest rate rule, we solve for the equilibrium responses of money growth $\hat\mu_1$, velocity $\hat v_1$, and inflation $\hat\pi_1$ to the change in the nominal interest rate $\hat\imath_0$ in period $t = 0$ by finding the value of money growth $\hat\mu_1$ such that the equilibrium responses of the real interest rate $\hat r_0$ and inflation $\hat\pi_1$ are consistent with the assumed movement in the nominal interest rate.
We solve for each of these responses in turn. Consider first the response of the real interest rate $\hat r_0$ to a change in money growth $\hat\mu_1$. This real interest rate is determined by the growth of the consumption of active households according to $\hat r_0 = \hat c_1(0) - \hat c_0(0)$. Given that the individual velocity for active households $v(0)$ is constant over time, the consumption of active households is given by $c_t(0) = v(0)\,m_t(0)\,M_t/P_t$, where $m_t(0) = M_t(0)/M_t$ is the share of the money supply held by active households. The real interest rate can therefore be written

$$\hat r_0 = \hat m_1(0) - \hat m_0(0) + \hat\mu_1 - \hat\pi_1. \tag{28}$$
Given that initial inflation and money growth are at their steady-state values, and given our assumed initial conditions, the distribution of money holdings across households at date $t = 0$ is equal to its steady-state value, and hence the share of the money supply held by active households, $m_t(0)$, and the velocity, $v_t$, are also equal to their steady-state values. Thus, we have $\hat m_0(0) = 0$ and

$$\hat r_0 = \left[\frac{\partial \log(m(0))}{\partial \log(\mu)} + 1 - \frac{\partial \log(\pi)}{\partial \log(\mu)}\right]\hat\mu_1, \tag{29}$$

where $\partial \log(m(0))/\partial \log(\mu)$ and $\partial \log(\pi)/\partial \log(\mu)$ are the elasticities of the share of money held by active households and of inflation with respect to money growth, both evaluated at the steady state. Combining (28) with the Fisher equation (27), these results then imply that the money growth required in period 1 to implement the nominal interest rate $\hat\imath_0$ in period 0 is given by

$$\hat\mu_1 = \left[\frac{1}{1 + \dfrac{\partial \log(m(0))}{\partial \log(\mu)}}\right]\hat\imath_0. \tag{30}$$
Thus, the real interest rate and inflation rate are given by ⎡ ⎡ ⎤ ⎤ ∂ log(π ) ∂ log(π ) ⎢ ⎢ ⎥ ⎥ ∂ log(μ) ∂ log(μ) ⎥ ˆı0 and πˆ 1 = ⎢ ⎥ ˆı0 . rˆ0 = ⎢ ⎣1 − ⎣ ⎦ ∂ log(m(0)) ∂ log(m(0)) ⎦ 1+ 1+ ∂ log(μ) ∂ log(μ) (31) To discuss these formulas, we return to the setting where periods are measured in units of calendar time, with T > 0 denoting the calendar length of time between activity, so that N = T / is the number of periods that elapse between activity. As we can see from these formulas, the difference between our model and the standard model with T = comes through the terms ∂ log(m(0))/∂ log(μ) and ∂ log(π )/∂ log(μ) reflecting the elasticities of the share of money held by active households and of inflation with respect to a money injection. In the standard model with T = (i.e., N = 1), a money injection has no effect in terms of redistributing money holdings across households, so that this elasticity is zero and the elasticity of inflation with respect to money growth is one. Thus, as we have seen, in this case, money growth and inflation respond one for one with the nominal interest rate and the real interest rate remains constant. In contrast, with
$T > \Delta$ (i.e., N > 1), the elasticity of the share of money holdings of active households with respect to money growth is positive and grows large as $\Delta \to 0$. Specifically, we show in the Appendix that, taking the limit as $\beta/\bar{\mu} \to 1$, the elasticity of the money share of active agents is approximately

$$\frac{\partial \log(m(0))}{\partial \log(\mu)} = \frac{T/\Delta - 1}{2}. \tag{32}$$

And, as we showed above, the elasticity of inflation is $\partial \log(\pi)/\partial \log(\mu) = (T/\Delta + 1)/(2T/\Delta)$, which is less than one for $T > \Delta$ and falls toward 1/2 as $\Delta \to 0$. Plugging in these expressions for the elasticities gives

$$\hat{\mu}_1 = \frac{2}{T/\Delta + 1}\,\hat{\imath}_0 \quad \text{and} \quad \hat{\pi}_1 = \frac{1}{T/\Delta}\,\hat{\imath}_0 \tag{33}$$

and that the real interest rate is

$$\hat{r}_0 = \frac{T/\Delta - 1}{T/\Delta}\,\hat{\imath}_0. \tag{34}$$
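The substitution that leads from (30) and (31) to (33) and (34) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes the limiting elasticities stated in the text, written in terms of the number of periods N between activity, and the function name `impact_responses` is illustrative.

```python
from fractions import Fraction


def impact_responses(n_periods, i0=Fraction(1)):
    """Impact responses (mu1, pi1, r0) to a nominal rate change i0.

    Assumes the limiting elasticities from the text (beta/mu-bar -> 1):
      d log m(0) / d log mu = (N - 1) / 2      # equation (32)
      d log pi   / d log mu = (N + 1) / (2 N)
    """
    if n_periods < 1:
        raise ValueError("n_periods must be at least 1")
    n = Fraction(n_periods)
    elas_m = (n - 1) / 2
    elas_pi = (n + 1) / (2 * n)
    mu1 = i0 / (1 + elas_m)   # equation (30)
    pi1 = elas_pi * mu1       # second expression in (31)
    r0 = i0 - pi1             # first expression in (31)
    return mu1, pi1, r0


if __name__ == "__main__":
    for n in (1, 12, 38):
        mu1, pi1, r0 = impact_responses(n)
        print(f"N={n}: mu1={mu1}, pi1={pi1}, r0={r0}")
```

With N = 1 the sketch returns the standard one-for-one response of inflation and no change in the real rate; with larger N the inflation response shrinks to 1/N of the change in the nominal rate, as in (33).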
The size of the response of real interest rates to a change in the nominal interest rate on impact is measured by $(T/\Delta - 1)/(T/\Delta)$, which is decreasing in $\Delta$. For small $\Delta$, a given increase in the nominal interest rate gives rise to a nearly one-for-one increase in the real rate and almost no increase in expected inflation. The small response of inflation to a change in interest rates comes from segmented asset markets: only the fraction $\Delta/T$ (i.e., 1/N) of households that are active receive the entire increase in the money supply, and so a given money injection has a disproportionately large impact on the marginal utility of a dollar for these households. Therefore, for small $\Delta$, a given change in nominal interest rates is obtained with a small change in money growth because that small change in the money supply has a large impact on real interest rates. Inflation is sluggish when $\Delta$ is small because this small change in money growth leads only to a small change in inflation.

In our model, taking $\Delta \to 0$ has two effects that together contribute to the sluggish response of inflation: reducing $\Delta$ increases the elasticity of the share of money held by active households and lowers the elasticity of inflation with respect to a change in money growth. The more important of these two effects is the first one. To see this, consider a constant velocity model in which agents are permanently divided into a fraction λ who are always active and
a remaining fraction 1 − λ who are never active, as in Alvarez, Lucas, and Weber (2001). Using the same resolution of the indeterminate price level, the relationship between real and nominal rates on impact is still given by (31) above. Because aggregate velocity is constant in this alternative model, $\partial \log(\pi)/\partial \log(\mu) = 1$. It can also be shown that, in this case, the elasticity of the share of money held by the permanently active agents to money growth is $\partial \log(m(0))/\partial \log(\mu) = (1 - \lambda)/\lambda$. Therefore, the response of the real rate is

$$\hat{r}_0 = (1 - \lambda)\,\hat{\imath}_0. \tag{35}$$
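For reference, the substitution behind (35) uses only (31) and the two elasticities just stated for the constant-velocity model:

```latex
\begin{align*}
1 + \frac{\partial \log(m(0))}{\partial \log(\mu)}
  &= 1 + \frac{1-\lambda}{\lambda} = \frac{1}{\lambda}, \\
\hat{\pi}_1 &= \frac{1}{1/\lambda}\,\hat{\imath}_0 = \lambda\,\hat{\imath}_0, \\
\hat{r}_0 &= \hat{\imath}_0 - \hat{\pi}_1 = (1-\lambda)\,\hat{\imath}_0 .
\end{align*}
```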
So if the fraction of agents who are always active in this alternative model is $\lambda = \Delta/T$ (i.e., λ = 1/N), then the alternative model with constant velocity gives the same response of inflation on impact to a change in the nominal interest rate as our model with variable velocity. In this sense, our result that the response of inflation to a change in interest rates is sluggish is driven mainly by asset market segmentation and not variable velocity.

For the remainder of this paper, for computational simplicity, we fix the period length to $\Delta = 1$ month so that N = T is the calendar length of time between activity in months. We now present the indeterminacy result that holds in our model.

PROPOSITION 1. Let $\{i_t^*\}_{t=0}^{\infty}$ be a given sequence of nominal interest rates and $M_{-1}^*(s)$ be the initial distribution of money holdings across households. Let $\{M_t^*, M_t^*(s), c_t^*(s), P_t^*\}_{t=0}^{\infty}$ be an equilibrium corresponding to this sequence of interest rates and these initial conditions. Then, for each $M_0$ in an open neighborhood of $M_0^*$, there exists a unique equilibrium $\{M_t, M_t(s), c_t(s), P_t\}_{t=0}^{\infty}$ consistent with the same path of interest rates $\{i_t^*\}_{t=0}^{\infty}$ and initial distribution of money holdings $M_{-1}^*(s)$. In this alternative equilibrium, for t ≥ N, the distributions of consumption, money growth, and inflation are unchanged in that

$$c_t(s) = c_t^*(s), \qquad \frac{M_{t+1}}{M_t} = \frac{M_{t+1}^*}{M_t^*}, \qquad \text{and} \qquad \frac{P_{t+1}}{P_t} = \frac{P_{t+1}^*}{P_t^*}.$$

For periods t = 0, . . . , N − 1, however, the distributions of consumption, money growth, and inflation all depend on the value of $M_0$.
Proof. See the Appendix.
This indeterminacy result reduces to the standard indeterminacy result when N = 1. (See, e.g., Woodford [2003b, Chapter 2] for an extended discussion.) And because for each $M_0$ there is a unique alternative equilibrium, even for N > 1 the indeterminacy is one-dimensional, as in the standard model. However, for N > 1, this indeterminacy result differs from the standard result in that the distribution of consumption across agents and the path of money growth and inflation differ across these equilibria for the first N periods. Hence, for N > 1, this indeterminacy has implications for real quantities and the real interest rate despite the fact that prices are fully flexible.

IV. QUANTITATIVE EXERCISES

The setup used in the preceding section, with u(c) = log(c) and γ = 0, simplifies calculations, because individual velocities v(s) are time-invariant. In the case where γ > 0 or for general u(c) the dynamics are more complex, because households’ expenditure decisions will be forward-looking and consequently individual velocities will be time-varying. Below, we examine the quantitative implications of our model for the persistence of the sluggish response of prices to money and inflation to interest rates under alternative parameterizations of our model numerically. We characterize the responses of prices and inflation numerically with values of the parameters N and γ chosen so that our model reproduces both the average level of velocity for a broad monetary aggregate held by U.S. households and the fraction of personal income that is received as wage and salary disbursements.9 We then conduct two exercises with the model to illustrate its quantitative implications. In the first exercise, we examine our model’s quantitative implications for the response of velocity to changes in money growth. In this experiment, we feed into the model the sequences of money growth and aggregate consumption shocks observed in U.S.
data and compare the model’s implications for short-run fluctuations in velocity with those observed in the data. We find that velocity in the model is highly correlated with velocity in the data. The magnitude of the fluctuations in the model, however, is significantly smaller than the magnitude of those observed in the data. In the second exercise, we examine the responses of money, prices, and velocity in the model to a monetary policy shock represented as a persistent movement in the nominal interest rate similar to those estimated as the response of the Federal Funds rate to a monetary policy shock in the VAR literature. Here we find that the corresponding impulse responses of money and prices implied by our model are similar to those estimated in the VAR literature. In particular, inflation in the model responds quite sluggishly to the change in interest rates.

9. The other parameters we need to assign are standard. We set the length of the time period to be a month; the time discount factor $\beta = 0.99^{1/12}$, that is, a 1% annual rate; and the steady-state money growth to be $\bar{\mu} = 1.01^{1/12}$, also a 1% annual rate, which is consistent with a 2% annual opportunity cost of money, as discussed below. We set the coefficient of relative risk aversion to one, that is, log utility.

IV.A. Choosing N and γ

In specifying our model, we have assumed that households hold their financial assets in two separate accounts, which we term a bank account and a brokerage account. The bank account is used to purchase consumption and offers a low rate of return on the assets deposited there, whereas the brokerage account can be used to hold a wide array of high-yielding financial assets. Transfers between the two accounts are assumed to be infrequent. To map the parameters of the model to observables in the data, we must interpret the theoretical objects in the model in terms of actual financial institutions in the data. Our preferred interpretation is to map the bank accounts in the model to what is called “retail banking” in the data, whereas the brokerage accounts in the model correspond to the array of actual brokerage accounts, mutual fund shares, pension funds, life insurance reserves, and equity in noncorporate businesses within which households hold claims on financial assets in a form that is not readily accessible for consumption purposes.
We choose this interpretation of bank and brokerage accounts in our model based on the observation, documented in the Appendix, that U.S. households pay a substantial cost (on the order of two percentage points) in terms of foregone interest to hold assets in retail banks relative to short-term Treasury securities. The evidence that we present indicates that there is no substantial difference in the opportunity cost of demand deposits (in M1) and the components of M2 (savings and time deposits) that we consider as part of our monetary aggregate. Our interpretation of bank and brokerage accounts differs from the traditional interpretation of Baumol–Tobin models,
where withdrawals are made from a safe interest-bearing asset into cash. Instead, we interpret the bank accounts as a broader monetary aggregate, and the account from which these transfers are made as one with high-yield managed portfolios of risky and riskless assets. Our interpretation is similar to those in the models of Duffie and Sun (1990) and Abel, Eberly, and Panageas (2007).

We measure U.S. households’ holdings of accounts in retail banks using the flow of funds accounts.10 From the flow of funds accounts, we observe that U.S. households hold a large quantity of such accounts, on the order of 1/2 to 2/3 times annual personal consumption expenditure. We use the implied average annual level of velocity of 1.5 to 2.0 as one statistic to guide our choice of N and γ for the quantitative results that follow. The other statistic that we use is based on our interpretation that the paycheck parameter in the model corresponds to regular wage and salary income automatically deposited in bank accounts in the data. Accordingly, as a baseline, we choose γ = 0.6 to match the fraction of personal income that is received as wage and salary disbursements observed in the data.11

10. In terms of measuring the relative sizes of these accounts using data from the Flow of Funds Accounts of the United States (Federal Reserve Board, 2007), our interpretation corresponds to the following breakdown of the data presented in Table B.100, Balance Sheet of Households and NonProfit Organizations. Total Financial Assets for households are listed on line 8 ($45,405 billion in 2007). We interpret line 9, Deposits ($7,334 billion in 2007), as corresponding to assets held in bank accounts. This category includes checkable deposits and currency, time and savings deposits, and money market shares. We interpret the remaining financial assets listed on line 14, Credit Market Instruments, and lines 23–29 including, among other things, corporate equities, mutual fund shares, life insurance reserves, pension fund reserves, and equity in noncorporate business, as corresponding to assets held in the households’ brokerage accounts.

11. From Table 2.1 of the National Income and Product Accounts of the United States (U.S. Department of Commerce, Bureau of Economic Analysis), we observe that this fraction has been equal to 60% on the average over the period 1959–2007.

The steady-state velocity implied by our model is a simple function of the parameters N and γ. In particular, holding N fixed, the model’s implications for steady-state velocity are an increasing function of the paycheck parameter γ because the automatic deposit of paychecks into households’ bank accounts allows faster circulation of money. In the example with u(c) = log(c) and γ = 0 that we used for intuition in the preceding sections, with $\beta/\bar{\mu}$ close to one, aggregate velocity is given by $\bar{v} = 2/(N + 1)$. With γ > 0, for $\beta/\bar{\mu}$ close to one, aggregate velocity is well approximated by $\bar{v} = 2/[(N + 1)(1 - \gamma)]$, which increases as γ increases. Given our choice of γ to match the fraction of personal income that is received as wage and salary disbursements, we choose the remaining parameter N to match average velocity of 1.5 on an annual basis. We choose the length of a period to be one month and as a baseline use N = 38, so that with γ = 0.6 the model produces an average velocity of 1.5. With these parameters, our model implies that households transfer money between their brokerage accounts and bank accounts very infrequently, on the order of only once every three years.

Now we argue that this assumption is not inconsistent with the available microeconomic evidence on the frequency with which agents trade financial assets held outside of their bank accounts. The first set of such microeconomic data concerns the frequency with which households trade equity. Such data are relevant because a household would have to trade equity to rebalance its portfolio between funds held in its bank account and equity held in its brokerage account. The Investment Company Institute (2002) conducted an extensive survey of households’ holdings and trading of equity in 1998 and 2001. They report the frequency with which households traded stocks and stock mutual funds in each year. Averaging across the 1998 and 2001 surveys, 48% of the households neither bought nor sold stocks, and 68% of the households neither bought nor sold stock mutual funds in 1998 and 2001. Because a household would have to buy or sell some of these assets to transfer funds between these higher-yielding assets held in a brokerage account and a lower-yielding bank account, these data, interpreted in the light of our model, would indicate choices of N ranging from roughly 24 (for roughly one-half of households trading these risky assets at least once within the year) to roughly 36 (for roughly one-third of households trading within the year).12 The second set of microeconomic data is that presented by Vissing-Jorgensen (2002).
She studies micro data on the frequency of household trading of stocks, bonds, mutual funds, and other risky assets obtained from the Consumer Expenditure Survey. In Figure 6 in her paper, she shows the fraction of households that bought or sold one of these assets over the course of one year as a function of their financial wealth at the beginning of the year. She finds that the fraction of agents who traded one of these assets ranges from roughly one-third to one-half of the households owning these assets at the beginning of the year. Again, given our interpretation that households hold stocks, bonds, mutual funds, and other risky assets in their brokerage accounts, these data would lead us to choose N between 24 and 36.

12. These data may also overstate the frequency with which households transfer funds between their equity accounts and their transactions accounts because some of the instances of equity trading are simply reallocations of the equity portfolio. The Investment Company Institute reports that more than 2/3 of those households that sold individual shares of stock in 1998 reinvested all of the proceeds, whereas 57% of those households that sold stock mutual funds reinvested all of the proceeds. In the context of our model, reallocation of the household portfolio in the asset market is costless and does not generate cash that can be used to purchase goods.

If a higher proportion of income is automatically available for spending (without the need for a transfer from the brokerage account), so that γ is higher than 0.6, then the chosen value for N needs to be correspondingly higher to keep the steady-state aggregate velocity constant. For example, matching $\bar{v} = 1.5$ annual with the higher γ = 0.7 requires about N = 52 months. If we interpret our model in terms of a narrower monetary aggregate with correspondingly higher velocity, then the chosen value of N needs to be lower. For example, to match $\bar{v} = 2.0$ annual with our benchmark γ = 0.6 requires N = 30 months, and to match $\bar{v} = 4.0$ annual with γ = 0.6 requires N = 15 months.

IV.B. The Response of Velocity to U.S. Money and Consumption Shocks

We now study the implications of our model for velocity in the short run when we feed in the money growth and aggregate consumption shocks observed in the U.S. data. We use monthly data on M2 as our measure of the monetary aggregate $M_t$, and we use monthly data on the deviation of the log of real personal consumption expenditure from a linear trend as our measure of the shocks to aggregate endowment $y_t$. To solve for households’ decision rules in the model, we estimate a VAR relating the current money growth rate and aggregate consumption to twelve lags of these variables and use this VAR as the stochastic process governing the exogenous shocks. We then generate the model’s implications for velocity by feeding in the actual series for these shocks.
To compare the implications of our model for the dynamics of money and velocity in the short run to the data, we detrend the series implied by the model using the HP-filter. Consider the implications of our model with N = 38 months and γ = 0.6. In Figure III, we show the HP-filtered series for velocity implied by our model with the corresponding HP-filtered series for velocity from the data. The correlation between velocity in the model and the data is .6. In the figure, we have used different scales in plotting the series from the model and the data.
FIGURE III
Model and Data Velocity
Results from feeding money growth and endowment shocks as measured in monthly U.S. data into the N = 38, γ = 0.6 model. Money growth shocks are demeaned M2 growth; endowment shocks are the deviations of real personal consumption expenditure from a linear trend. To solve for households’ decision rules in the model, we estimate a VAR relating the current M2 growth rate and real personal consumption expenditure growth rate to 12 lags of these variables and use this VAR as the stochastic process governing the exogenous shocks. All variables are reported in logs as deviations from an HP trend with smoothing parameter $3^4 \times 1{,}600$.
These different scales reflect the fact that the standard deviation of velocity in the model is only 40% of the standard deviation of velocity in the data. Given that we have used nothing but steady-state information to choose the parameters of this model, we regard the high correlation between velocity from the model and the data as a remarkable success. Observe that if we had chosen N = 1, as in a standard cash-in-advance model, velocity as implied by the model would be constant at one regardless of the shock process and, hence, the correlation between velocity in the model and velocity in the data would be zero. We interpret this finding as offering support for the hypothesis that a substantial portion of the negative correlation between the short-run movements of velocity and the ratio of money to consumption is due to the endogenous response of velocity to changes in the ratio of money to consumption.
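The (N, γ) pairs used in these experiments can be related to annual velocity with the steady-state approximation from Section IV.A. The sketch below is only a back-of-the-envelope check: it assumes the approximation reads as 2 divided by the product (N + 1)(1 − γ) per monthly period, which holds for β/μ̄ close to one, and it annualizes by multiplying by 12. The function name `annual_velocity` is illustrative.

```python
def annual_velocity(n_months, gamma, periods_per_year=12):
    """Approximate annual steady-state velocity for a monthly model.

    Uses the approximation v = 2 / ((N + 1) * (1 - gamma)) per period,
    valid when beta / mu-bar is close to one, then annualizes.
    """
    if n_months < 1:
        raise ValueError("n_months must be at least 1")
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    per_period = 2.0 / ((n_months + 1) * (1.0 - gamma))
    return periods_per_year * per_period


if __name__ == "__main__":
    # (N, gamma) pairs quoted in Section IV.A
    for n, g in [(38, 0.6), (52, 0.7), (30, 0.6), (15, 0.6)]:
        print(f"N={n}, gamma={g}: annual velocity ~ {annual_velocity(n, g):.2f}")
```

Under this reading, the four pairs give annual velocities of roughly 1.5, 1.5, 1.9, and 3.8, close to the targets of 1.5, 1.5, 2.0, and 4.0 quoted in the text.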
We obtain broadly similar results with the alternative values of N and γ discussed above. For example, if we have γ = 0.7 but increase N to 52 to keep $\bar{v} = 1.5$ annual, then the correlation of HP-filtered velocity implied by our model and HP-filtered velocity in the data is still .51 (down from .60 for the benchmark parameters), whereas the standard deviation of velocity in the model rises slightly, to 45% of the standard deviation of velocity in the data. If instead we keep γ = 0.6 but choose a lower N = 30 to match a higher velocity of $\bar{v} = 2.0$ annual, then the correlation of model and data velocity is .56, almost the same as in the benchmark, but the standard deviation of velocity in the model falls to 32% of the data. Similarly, if we choose N = 15 to match even higher velocity of $\bar{v} = 4.0$ annual, then the correlation of model and data velocity falls slightly further to .48, whereas the standard deviation of velocity in the model falls to 21% of the data. Reducing N to match the higher velocities implied by narrower monetary aggregates impairs the ability of the model to endogenously produce volatile velocity, but does not substantially alter the correlation between data and model velocity.

IV.C. The Response to a Shock to the Interest Rate

We now consider the response of inflation to a shock to the nominal interest rate. A large literature estimates the response of the macroeconomy to a monetary policy shock modeled as a shock to the Federal Funds rate. The consensus in this literature is that a monetary policy shock is associated with a persistent increase in the short-term nominal interest rate, a persistent decrease in the money supply, and, at least initially, little or no response in the price level (Christiano, Eichenbaum, and Evans 1999).13

To simulate the effects of a monetary policy shock, we solve for a money growth path consistent with an exogenous, persistent movement in the short-term nominal interest rate. This raises two technical issues. First, recall from Proposition 1 that there is an indeterminacy in this model if the nominal interest rate is exogenous. In equilibrium, there are many paths for money growth, all consistent with the same exogenously specified path for nominal interest rates.14 In the quantitative experiment below, we resolve this indeterminacy in the same way that we did in Section III. We choose the unique path for money growth that, on impact, leaves the price level unchanged.

13. See Cochrane (1994), Leeper, Sims, and Zha (1996), Christiano, Eichenbaum, and Evans (2005), and Uhlig (2005) for additional examples of such estimates.

14. The indeterminacy result of Section III is for u(c) = log(c) and γ = 0 but extends to the case of general isoelastic preferences and γ > 0.

A second technical issue is that in this model the endogenous dynamics with an exogenous nominal interest rate last exactly N periods. The matrix describing the equilibrium dynamics of endogenous variables has its N eigenvalues all exactly equal to zero. This implies that, if the interest rate is set at its steady-state value but the initial distribution of money holdings is not, then steady state will be reached in exactly N periods. The repetition of the eigenvalues also implies that the matrix that describes equilibrium dynamics is not diagonalizable, and hence, this model cannot be solved using standard methods such as those outlined by Blanchard and Kahn (1980) or Uhlig (1999). In an Online Technical Appendix to this paper, we develop a specific solution method for this model based on the use of the generalized Schur form that makes use of the information that the eigenvalues of the matrix describing equilibrium dynamics are all equal to zero.15

We now study the quantitative implications of our model with N = 38 and γ = 0.6, having solved for money growth consistent with the log of the short-term gross interest rate following an AR(1) process with persistence ρ = 0.87. This persistence produces a response of the nominal interest rate to a monetary policy shock similar to that estimated by Christiano, Eichenbaum, and Evans (1999). Figure IV shows the impulse responses of inflation, money growth, and velocity growth following a persistent increase in the nominal interest rate. The model produces a persistent liquidity effect both in the sense that an increase in the nominal interest rate is associated with a fall in money growth and in the sense that an increase in the nominal interest rate is associated, at least initially, with an increase in the real interest rate of roughly the same size.
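As an aside on the second technical issue above, the finite-horizon property can be illustrated with a toy matrix. The sketch below does not use the model's actual transition matrix; it assumes a simple N × N shift matrix, which has all of its eigenvalues equal to zero, is not diagonalizable, and sends any initial deviation to exactly zero within N steps (in exactly N steps when the last component of the deviation is nonzero). All names are illustrative.

```python
def shift_matrix(n):
    """n x n matrix with ones on the first superdiagonal (a nilpotent Jordan block)."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]


def apply(matrix, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def periods_to_steady_state(matrix, deviation):
    """Count applications of `matrix` until the deviation is exactly zero."""
    steps = 0
    while any(deviation):
        deviation = apply(matrix, deviation)
        steps += 1
        if steps > len(matrix):
            raise RuntimeError("matrix is not nilpotent within its dimension")
    return steps


if __name__ == "__main__":
    n = 5
    a = shift_matrix(n)
    # A deviation loaded on the last state takes exactly n periods to vanish.
    print(periods_to_steady_state(a, [0, 0, 0, 0, 1]))
```

Because such a matrix has a single repeated eigenvalue and no full set of eigenvectors, solution methods that rely on an eigenvector decomposition do not apply directly, which is the point made in the text.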
Although it is not plotted separately, the real interest rate in this figure can be read as the difference between the impulse response of the nominal interest rate and the impulse response for inflation. As is clear in the figure, the response of the real interest rate to the change in the nominal interest rate is quite persistent, and, as a result, inflation is persistently sluggish, responding only slowly to the increase in the nominal interest rate.

15. This Online Technical Appendix is available at http://pages.stern.nyu.edu/~cedmond/. We also found that direct methods based on use of the generalized Schur form, as suggested by Klein (2000) and others, did not correctly identify that the matrix describing equilibrium dynamics had eigenvalues all equal to zero. This appears to be a numerical issue, because this methodology should work in cases with repeated eigenvalues.

FIGURE IV
Large Liquidity Effects
Impulse responses of money growth, inflation, and velocity growth to a persistent nominal interest rate shock in the monthly N = 38, γ = 0.6 model. A unique equilibrium process for money growth is identified by selecting the one consistent with no movement in the price level on impact. All variables are reported as percent deviations from steady state.

Figure V shows the same impulse responses, but for the levels of the variables rather than their growth rates. The aggregate price level appears “sticky,” showing little or no response to the shock to interest rates for at least the first twelve months. It is only after twelve months have passed that the money stock and the price level begin to rise together in the manner that would be expected in a flexible price model following a persistent increase in the nominal interest rate. This slow response of the price level simply reflects the persistently sluggish response of inflation.

FIGURE V
Sluggish Price Response to Persistent Interest Rate Shock
Impulse responses of the money supply, price level, and velocity to a persistent nominal interest rate shock in the monthly N = 38, γ = 0.6 model. A unique equilibrium process for money growth is identified by selecting the one consistent with no movement in the price level on impact. All variables are reported as percent deviations from steady state.

This quantitative exercise indicates that our model can account for a substantial delay in the response of inflation to an exogenous shock to the nominal interest rate, and it does so because of the persistent response of the real interest rate to the change in the nominal interest rate.

V. CONCLUSIONS

In this paper, we have put forward a simple inventory-theoretic model of the demand for money and have shown, in that model, that the price level does not respond immediately to an exogenous increase in the money supply and that expected inflation does not respond immediately to an exogenous increase in the nominal interest rate. Instead, there is an extended period of price sluggishness that occurs because the exogenous increase in the money supply leads, at least initially, to an endogenous decrease in the velocity of money and an extended period of inflation
sluggishness that occurs because of asset market segmentation. We have argued that if this simple model is used to analyze the dynamics of money and velocity using a relatively broad measure of money, then it produces sluggish responses of the price level and inflation similar to that estimated in the VAR literature for the response of the economy to monetary policy shocks. In keeping this model simple, we have abstracted from a number of issues that might play an important role in the development of a more complete model. First, we have simply assumed that households have the opportunity to transfer funds between their brokerage and bank accounts only every N periods and have not allowed households to alter the timing of these transactions after paying some fixed cost. This simplifying assumption allowed us to characterize equilibrium in an analytically tractable specification of our model. A model with explicit consideration of fixed costs of money transfers between accounts must be computed numerically. For work along these lines, see Khan and Thomas (2007). In their benchmark calibrated example, they find that these costs substantially reinforce the sluggishness of prices and the persistence of liquidity effects relative to that seen in our model.16 Second, we have simply assumed that output is exogenous in order to focus on the impact of monetary policy on prices and inflation. The impact of monetary policy shocks on output in a version of our model in which production is endogenous is an important area for future research. We have shown that monetary policy shocks have a direct impact on real asset prices in general and on real interest rates in particular. In a model with endogenous production, these changes in real asset prices would induce firms and workers to shift production and investment through time. The specific results that would be obtained would clearly depend on the exact specification of the production structure of the model. 
In recent work, Edmond (2003) and King and Thomas (2007) have begun to consider such models.

There is a large literature that looks to model the sluggish responses of prices and inflation in an alternative framework in which prices are sticky because firms adjust prices infrequently.17

16. As Khan and Thomas (2007) emphasize, this result is sensitive to the shape of the idiosyncratic distribution of fixed costs facing households. The reason for this sensitivity is a “selection effect” familiar from models of price setting subject to menu costs.

17. This literature includes models in which firms set prices according to time-dependent rules (Fischer 1977; Taylor 1980; Rotemberg 1982; Calvo 1983) or state-dependent rules (Caplin and Leahy 1991; Dotsey, King, and Wolman 1999; Midrigan 2006; Golosov and Lucas 2007) or, more recently, on the basis of slowly updated information (Mankiw and Reis 2002; Woodford 2003a).

Our results on the sluggish responses of prices to changes in money and of inflation to changes in the nominal interest rate arise from theoretical mechanisms that are unrelated to firms’ price-setting decisions. Moreover, the empirical phenomena that motivate our study are also unrelated to the extent of nominal rigidities.

Consider first our results on the sluggish response of prices to changes in the stock of money. In our model, prices respond sluggishly to changes in money because nominal expenditure responds sluggishly to changes in money: velocity, which is the ratio of nominal expenditure to money, falls when money rises. The response of nominal expenditure to a change in the stock of money is a feature of money demand, not of the extent of nominal rigidities in terms of firms’ price-setting decisions. For example, if one posits money demand that is interest-inelastic as part of a sticky price model, then nominal expenditure will respond one for one with the stock of money regardless of the extent of nominal rigidities assumed in the model. Thus, modeling money demand in our way in a sticky price setup (where changes in nominal demand become changes in real output) implies that a given money supply shock has a smaller real effect on impact but a more persistent real effect than that obtained using an otherwise standard specification of money demand. Researchers using sticky price models may find it useful to incorporate our model of money demand when they look to account for the impact of a change in the stock of money on the economy. It is clear from our Figure I that this sluggish response of nominal expenditure to money is an important component of understanding the dynamics of prices and money in the unconditional U.S. data. VAR results in Altig et al. (2004) indicate that nominal expenditure also responds sluggishly to a shock to monetary policy.

Consider next the relationship between our results and the sluggish response of inflation to changes in the nominal interest rate relative to those in sticky price models. Our model is able to produce a sluggish response of inflation to a persistent shock to the nominal interest rate due to the segmentation of asset markets. The money injections that implement a persistent change in the nominal interest rate also lead to a persistent change in the real
SLUGGISH RESPONSES OF PRICES AND INFLATION
953
interest rate of nearly the same magnitude. Sluggish inflation then follows directly, not as a consequence of sticky prices, but instead as a consequence of the standard Fisher equation linking nominal interest rates, real interest rates, and inflation. In contrast, standard sticky price models have serious problems in reproducing the estimated responses of inflation to a shock to monetary policy modeled as a persistent shock to the nominal interest rate. Mankiw (2001), for example, discusses how a standard sticky price model predicts that the largest response of inflation to a persistent shock to the nominal interest rate occurs on impact, and not in a delayed fashion. He uses this observation to argue for a model with “sticky information.” Sims (1998) makes a similar argument. The difficulty that sticky price models face in generating sluggish inflation arises from the fact that standard sticky price models build on a representative household framework linking the real interest rate to the growth of marginal utility for the representative household, and hence aggregate consumption, through a consumption Euler equation. Thus, in these models, if expected inflation responds sluggishly to a change in the nominal interest rate, then the growth rate of marginal utility for the representative household must respond strongly to a change in the nominal interest rate. Hence, capturing simultaneously a sluggish response of expected inflation and aggregate consumption to a change in the short-term nominal interest rate has been a challenge for these models. Frontier sticky price models, such as Christiano, Eichenbaum, and Evans (2005), use time nonseparable preferences and an elaborate set of adjustment costs and shocks to help their models reproduce a specific set of impulse responses, including the sluggish response of inflation. 
Canzoneri, Cumby, and Diba (2007), however, observe that standard sticky price models equate the nominal interest rate targeted by the central bank with the interest rate implied by the representative household’s consumption Euler equation and that this assumption fails quite dramatically in the data even if one considers a wide array of time nonseparable preferences for the household. They find a negative correlation between the Federal Funds rate in the data and the short-term nominal interest rates implied by a wide variety of sticky price models’ consumption Euler equations. By contrast, our model abandons the assumption of a representative household for pricing assets. In our model, the real interest rate is linked to the growth of marginal utility for active
households, not for a representative household consuming aggregate consumption. Hence, as we have seen in our model, we can produce a sluggish response of expected inflation to a change in the nominal interest rate even if aggregate consumption is constant and hence has no response at all to a change in the nominal interest rate. Researchers using models with nominal rigidities may find it useful to incorporate asset market segmentation of the kind we examine here in their models in addressing some of the difficulties their models have with the consumption Euler equation.

APPENDIX

A. Data

All data are monthly 1959:1–2006:12 and seasonally adjusted. We measure the price level P as the personal consumption expenditures chain-type price index with a base year of 2000 from the Bureau of Economic Analysis (BEA). We measure real consumption c as personal consumption expenditure on nondurables and services from the BEA deflated by P. We measure the money supply M as the M2 stock from the Board of Governors of the Federal Reserve System. We define velocity as v ≡ Pc/M.

Here we document the robustness of the negative correlation between log(M/c) and log(v) using alternative detrending methods to characterize the short-run fluctuations in money and velocity. We report statistics for HP-filtered data based on the smoothing parameter λ = 1,600 × 3^4 recommended by Ravn and Uhlig (2002) for monthly data. These are the statistics reported in the main text. In Table A.1, we also report statistics for the lower smoothing parameter λ = 1,600 × 3^2 and for monthly differences and annual differences. No matter how the short-run fluctuations are measured, we find that there is a pronounced negative correlation between log(M/c) and log(v) and that the standard deviation of log(v) is almost as high as or higher than the standard deviation of log(M/c).
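The detrending step can be illustrated with a minimal sketch. It assumes the standard HP-filter definition (the trend minimizes squared deviations from the series plus λ times the squared second differences of the trend) and solves the resulting linear system with a dense elimination that is only sensible for short series. The function names are hypothetical, and this is not the code used to produce Table A.1.

```python
def hp_smoothing_parameter(freq_ratio, power=4, quarterly_lambda=1600):
    """Scale the quarterly HP parameter; freq_ratio = 3 for monthly data."""
    return quarterly_lambda * freq_ratio ** power


def hp_filter(y, lam):
    """Return (trend, cycle) of the series y for smoothing parameter lam."""
    n = len(y)
    # System matrix I + lam * K'K, with K the (n - 2) x n second-difference operator.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for i in range(3):
            for j in range(3):
                a[r + i][r + j] += lam * stencil[i] * stencil[j]
    b = [float(value) for value in y]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution for the trend.
    trend = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(a[i][c] * trend[c] for c in range(i + 1, n))
        trend[i] = (b[i] - tail) / a[i][i]
    cycle = [yi - ti for yi, ti in zip(y, trend)]
    return trend, cycle
```

With `freq_ratio = 3`, the two smoothing parameters in the text are 1,600 × 3^4 = 129,600 and 1,600 × 3^2 = 14,400. In use, the filter would be applied separately to log(M/c) and log(v) before computing the correlation of the two cyclical components.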
We measure the opportunity costs of monetary assets using data collected by the Monetary Services Index project of the Federal Reserve Bank of St. Louis. We measure the opportunity cost of an asset as the short-term Treasury rate less the own rate of return on the asset in question. We take the short-term Treasury rate and own rates of return on currency and demand deposits from the spreadsheet ADJSAM.WKS available from the website of the Federal Reserve Bank of St. Louis. We take the own rate of return on M2 from the Board of Governors of the Federal Reserve System. All opportunity cost data are monthly 1959:1–2006:2 and seasonally adjusted. As is clear from Table A.2, the average opportunity cost of holding demand deposits and M2 is roughly similar, on the order of 200 basis points. Both opportunity costs have fallen somewhat in recent years.

TABLE A.1
SHORT-RUN CORRELATION OF MONEY AND VELOCITY

                      HP-filtered   HP-filtered   Differenced   Differenced
                      1,600 × 3^2   1,600 × 3^4   Monthly       Annual
Correlation           −.91          −.88          −.63          −.86
Standard deviation    1.25          1.01          0.98          1.33

Note. Correlation of velocity v with money/real consumption M/c, and standard deviation of v relative to that of M/c, for alternative measures of short-run fluctuations. We measure the money supply M as the M2 stock from the Board of Governors of the Federal Reserve System. We measure real consumption c as personal consumption expenditure on nondurables and services from the Bureau of Economic Analysis deflated by the personal consumption expenditures chain-type price index P from the BEA. We define velocity as v ≡ Pc/M. All data are monthly 1959:1–2006:12 and seasonally adjusted. All variables are reported in logs.

TABLE A.2
OPPORTUNITY COSTS OF MONETARY ASSETS

                   1959–2006   1959–1990   1990–2006
Currency           4.91        5.61        3.45
Demand deposits    1.80        2.25        0.85
M2                 2.08        2.30        1.64

Note. The opportunity costs of monetary assets in percentage points. We measure opportunity costs as the short-term Treasury rate less the own rate of return on the asset in question. We take the short-term Treasury rate and own rates of return on currency and demand deposits from the Monetary Services Index project of the Federal Reserve Bank of St. Louis. We take the own rate of return on M2 from the Board of Governors of the Federal Reserve System. All data are monthly 1959:1–2006:2 and seasonally adjusted.

B. Algebra of Steady-State Money Distribution and Elasticities

Let the length of a period be $\Delta > 0$, measured in fractions of a year. Let the length of time between periods of activity be $T$, such that the number of periods between periods of activity is $N = T/\Delta$. Let period utility be $u(c) = \log(c)$ and set the paycheck parameter to $\gamma = 0$. In this setting, individual velocity in period $t$ is time-invariant and given by

$$v(s) = \frac{1 - \beta^{\Delta}}{1 - \beta^{(N-s)\Delta}}$$

for $s = 0, 1, \ldots, N - 1$.
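For households $s = 1, \ldots, N - 1$, the distribution of money holdings satisfies

$$\frac{M_t(s)}{M_t} = [1 - v(s-1)]\,\frac{M_{t-1}(s-1)}{M_{t-1}}\,\frac{1}{\mu_t^{\Delta}}, \tag{36}$$

with money market clearing implying that

$$\frac{1}{N}\frac{M_t(0)}{M_t} = 1 - \frac{1}{N}\sum_{s=1}^{N-1}[1 - v(s-1)]\,\frac{M_{t-1}(s-1)}{M_{t-1}}\,\frac{1}{\mu_t^{\Delta}}. \tag{37}$$

Now consider a steady state with $\mu_t = \bar\mu$. Iterating on the steady-state version of (36) and using the formula for individual velocity shows that the steady-state money holdings of household $s$ are related to the holdings of an active household by

$$\frac{M(s)}{M} = \frac{1}{\bar\mu^{s\Delta}}\left[\prod_{i=0}^{s-1}(1 - v(i))\right]\frac{M(0)}{M}. \tag{38}$$

And because

$$\prod_{i=0}^{s-1}(1 - v(i)) = \prod_{i=0}^{s-1}\beta^{\Delta}\,\frac{1 - \beta^{(N-i-1)\Delta}}{1 - \beta^{(N-i)\Delta}} = \beta^{s\Delta}\,\frac{1 - \beta^{(N-s)\Delta}}{1 - \beta^{N\Delta}},$$

we have

$$\frac{M(s)}{M} = \left(\frac{\beta}{\bar\mu}\right)^{s\Delta}\frac{1 - \beta^{(N-s)\Delta}}{1 - \beta^{N\Delta}}\,\frac{M(0)}{M}. \tag{39}$$

We now need to find $M(0)/M$. We do this using steady-state money market clearing,

$$\frac{1}{N}\frac{M(0)}{M} = 1 - \frac{1}{N}\sum_{s=1}^{N-1}\frac{M(s)}{M} = 1 - \frac{1}{N}\sum_{s=1}^{N-1}\frac{M(0)}{M}\left(\frac{\beta}{\bar\mu}\right)^{s\Delta}\frac{1 - \beta^{(N-s)\Delta}}{1 - \beta^{N\Delta}}, \tag{40}$$

and so

$$1 = \frac{1}{N}\frac{M(0)}{M}\sum_{s=0}^{N-1}\left(\frac{\beta}{\bar\mu}\right)^{s\Delta}\frac{1 - \beta^{(N-s)\Delta}}{1 - \beta^{N\Delta}}. \tag{41}$$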
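Computing the sums and rearranging gives the solution

$$\frac{1}{N}\frac{M(0)}{M} = \left(1 - \beta^{N\Delta}\right)\left[\frac{1 - (\beta/\bar\mu)^{N\Delta}}{1 - (\beta/\bar\mu)^{\Delta}} - \beta^{N\Delta}\,\frac{1 - (1/\bar\mu)^{N\Delta}}{1 - (1/\bar\mu)^{\Delta}}\right]^{-1}. \tag{42}$$

Plugging this formula for $M(0)/M$ into equation (39) gives the complete solution for the steady-state distribution of money holdings. Steady-state aggregate velocity at an annual rate is then given by

$$\bar v = \frac{1}{\Delta}\,\frac{1}{N}\sum_{s=0}^{N-1} v(s)\,\frac{M(s)}{M}. \tag{43}$$

We can use the formula for individual velocity in each period to simplify the terms in the sum. For each $s$ we have

$$v(s)\,\frac{M(s)}{M} = \frac{1 - \beta^{\Delta}}{1 - \beta^{(N-s)\Delta}}\left(\frac{\beta}{\bar\mu}\right)^{s\Delta}\frac{1 - \beta^{(N-s)\Delta}}{1 - \beta^{N\Delta}}\,\frac{M(0)}{M} = \frac{1 - \beta^{\Delta}}{1 - \beta^{N\Delta}}\left(\frac{\beta}{\bar\mu}\right)^{s\Delta}\frac{M(0)}{M}.$$

And so, using the formula for $M(0)/M$ given in equation (42) and then summing over $s$, we have

$$\bar v = \frac{1 - \beta^{\Delta}}{\Delta}\left[1 - \beta^{N\Delta}\,\frac{1 - (1/\bar\mu)^{N\Delta}}{1 - (1/\bar\mu)^{\Delta}}\,\frac{1 - (\beta/\bar\mu)^{\Delta}}{1 - (\beta/\bar\mu)^{N\Delta}}\right]^{-1}. \tag{44}$$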
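As a numerical companion to equations (39) through (44), the following minimal sketch computes the steady-state money shares directly from the sums and compares aggregate velocity with the closed form. The function names and the parameter values used to exercise them are hypothetical. Dividing by the period length to annualize velocity is an assumption about how "at an annual rate" enters the formulas.

```python
def individual_velocity(s, N, beta, delta):
    """v(s) = (1 - beta^delta) / (1 - beta^((N - s) * delta))."""
    return (1 - beta ** delta) / (1 - beta ** ((N - s) * delta))


def steady_state_shares(N, beta, mu_bar, delta):
    """Money shares M(s)/M for s = 0..N-1, from equations (39) and (41)."""
    weights = [
        (beta / mu_bar) ** (s * delta)
        * (1 - beta ** ((N - s) * delta)) / (1 - beta ** (N * delta))
        for s in range(N)
    ]
    m0 = N / sum(weights)  # equation (41) solved for M(0)/M
    return [m0 * w for w in weights]


def aggregate_velocity(N, beta, mu_bar, delta):
    """Equation (43): average of v(s) * M(s)/M, annualized by 1/delta."""
    shares = steady_state_shares(N, beta, mu_bar, delta)
    per_period = sum(
        individual_velocity(s, N, beta, delta) * shares[s] for s in range(N)
    ) / N
    return per_period / delta


def aggregate_velocity_closed_form(N, beta, mu_bar, delta):
    """Equation (44); needs mu_bar != 1 and beta != mu_bar to avoid 0/0."""
    money_term = (1 - (1 / mu_bar) ** (N * delta)) / (1 - (1 / mu_bar) ** delta)
    mixed_term = (1 - (beta / mu_bar) ** delta) / (1 - (beta / mu_bar) ** (N * delta))
    bracket = 1 - beta ** (N * delta) * money_term * mixed_term
    return (1 - beta ** delta) / delta / bracket
```

Working from the sums rather than the closed form has the advantage that the case of constant money, where equation (44) involves a 0/0 limit, can be evaluated without special handling.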
To develop intuition, we simplify these formulas by studying a steady state with $\bar\mu = 1$ in the limit as $\beta \to 1$. We begin with the further special case of $\Delta = 1$ month so that we can quickly derive the main formulas used in the text and then return to the case of general $\Delta > 0$ at the end. With this extra structure, the steady-state money holdings of household $s$ are related to the holdings of an active household by

$$\frac{M(s)}{M} = \frac{N - s}{N}\,\frac{M(0)}{M}.$$

And so, on using this formula in money market clearing, we also get $M(0)/M = 2N/(N + 1)$, so that we have the complete solution for the distribution of money holdings,

$$\frac{M(s)}{M} = 2\,\frac{N - s}{N + 1}, \tag{45}$$

for $s = 0, 1, \ldots, N - 1$. Steady-state aggregate velocity is then

$$\bar v = \frac{1}{N}\sum_{s=0}^{N-1} v(s)\,\frac{M(s)}{M} = \frac{1}{N}\sum_{s=0}^{N-1}\frac{1}{N - s}\,\frac{2(N - s)}{N + 1} = \frac{2}{N + 1}, \tag{46}$$

as used in the main text.

Continuing with this special case of $\Delta = 1$ month, we now derive the elasticity of aggregate velocity with respect to money growth. Specifically, using money market clearing and the law of motion for the money holdings, we have

$$v_t = \frac{1}{N}\sum_{s=0}^{N-1} v(s)\,\frac{M_t(s)}{M_t} = v(0)\left[1 - \frac{1}{N}\sum_{s=1}^{N-1}\frac{M_t(s)}{M_t}\right] + \frac{1}{N}\sum_{s=1}^{N-1} v(s)\,\frac{M_t(s)}{M_t}$$

$$= v(0) + \frac{1}{N}\sum_{s=1}^{N-1}[v(s) - v(0)]\,\frac{M_t(s)}{M_t} = v(0) + \frac{1}{N}\sum_{s=1}^{N-1}[v(s) - v(0)][1 - v(s-1)]\,\frac{M_{t-1}(s-1)}{M_{t-1}}\,\frac{1}{\mu_t}.$$

And so

$$v_t\,\mu_t = v(0)\,\mu_t + \frac{1}{N}\sum_{s=1}^{N-1}[v(s) - v(0)][1 - v(s-1)]\,\frac{M_{t-1}(s-1)}{M_{t-1}}. \tag{47}$$

Because the sum on the right-hand side depends only on money holdings carried over from period $t - 1$, it does not vary with $\mu_t$, which gives the key result

$$\frac{\partial (v_t\,\mu_t)}{\partial \mu_t} = v(0), \tag{48}$$

which is a constant for all $t$. Using the product rule $\partial(v_t\mu_t)/\partial\mu_t = (\partial v_t/\partial\mu_t)\mu_t + v_t$, we can solve for the elasticity in terms of $v(0)$, a known constant, and aggregate velocity. We evaluate this elasticity at steady state $v_t = \bar v$ to get

$$\frac{\partial \log(v)}{\partial \log(\mu)} = \frac{v(0)}{\bar v} - 1 = -\frac{1}{2}\,\frac{N - 1}{N}. \tag{49}$$
And because the aggregate endowment $y$ is constant, the elasticity of inflation with respect to money growth evaluated at steady state is

$$\frac{\partial \log(\pi)}{\partial \log(\mu)} = \frac{\partial \log(v)}{\partial \log(\mu)} + 1 = \frac{v(0)}{\bar v} = \frac{1}{2}\,\frac{N + 1}{N}. \tag{50}$$

We now derive the elasticity of the share of money held by active households with respect to money growth. Multiplying equation (37) by $M_t$ and differentiating both sides with respect to $M_t$ we get

$$\frac{\partial M_t(0)}{\partial M_t} = N.$$

Evaluated at steady state,

$$\frac{\partial \log(M(0))}{\partial \log(\mu)} = N\,\frac{M}{M(0)} = N\,\frac{N + 1}{2N} = \frac{N + 1}{2}.$$

Now let $m(0) \equiv M(0)/M$ denote the steady-state money share. Then we have

$$\frac{\partial \log(m(0))}{\partial \log(\mu)} = \frac{\partial \log(M(0))}{\partial \log(\mu)} - 1 = \frac{N + 1}{2} - 1 = \frac{N - 1}{2}. \tag{51}$$
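The three elasticities in equations (49) through (51) can be checked by finite differences. The sketch below assumes the special case used above (constant money in steady state, the limit in which individual velocity is 1/(N − s), and a period length of one month), perturbs current money growth once, and recomputes velocity and the active households' money share from equations (36) and (37). All function names are hypothetical.

```python
import math


def period_response(mu, N):
    """Velocity v_t and active share m_t(0) when current gross money growth is mu."""
    v = [1.0 / (N - s) for s in range(N)]               # individual velocities
    m_ss = [2.0 * (N - s) / (N + 1) for s in range(N)]  # equation (45)
    m = [0.0] * N
    for s in range(1, N):
        m[s] = (1 - v[s - 1]) * m_ss[s - 1] / mu        # equation (36)
    m[0] = N - sum(m[1:])                               # equation (37)
    velocity = sum(v[s] * m[s] for s in range(N)) / N
    return velocity, m[0]


def log_elasticity(f, h=1e-5):
    """Central-difference estimate of d log f / d log mu at mu = 1."""
    return (math.log(f(1 + h)) - math.log(f(1 - h))) / (
        math.log(1 + h) - math.log(1 - h)
    )


def elasticities(N):
    v_bar = period_response(1.0, N)[0]
    return {
        "velocity": log_elasticity(lambda mu: period_response(mu, N)[0]),
        # With a constant endowment, inflation is mu * v_t / v_{t-1}.
        "inflation": log_elasticity(lambda mu: mu * period_response(mu, N)[0] / v_bar),
        "active_share": log_elasticity(lambda mu: period_response(mu, N)[1]),
    }
```

For a given N, the three entries returned by `elasticities(N)` should be close to −(N − 1)/(2N), (N + 1)/(2N), and (N − 1)/2, respectively.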
To obtain the expressions with arbitrary $\Delta$ used in the main text, set $N = T/\Delta$ in equations (49)–(51). More formally, use the expression for $\bar v$ in equation (44) and calculate the limit as $\beta/\bar\mu \to 1$ using l’Hôpital’s rule.

C. Dynamic Response of Velocity to a Money Growth Shock

Here we analytically characterize the impulse response of velocity to a money growth shock. The dynamics of velocity following a money growth shock are determined by the subsequent evolution of the distribution of money over time. It is easiest to analyze the dynamics of velocity following a shock in a log-linearized version of the model. We proceed in two steps. First, we provide an autoregressive moving average (ARMA) representation of the dynamics of the money distribution. Second, we map the ARMA representation into a formula for the impulse response of velocity that is exact (up to the log-linearization) for the first $N - 1$ periods after a shock. For simplicity, we consider only the special case of a period length $\Delta = 1$ month.

Two sets of equations govern the dynamics of the distribution of money. First, there is an equation requiring that the sum of the log deviations of the fractions of money held by agents of type $s$ be zero,

$$0 = m(0)\,\hat m_t(0) + \sum_{s=1}^{N-1} m(s)\,\hat m_t(s),$$

where steady-state money shares are $m(s) \equiv M(s)/M$ and $\hat m_t(s) \equiv \log[m_t(s)/m(s)]$. Second, there is a set of equations for $s = 1, \ldots, N - 1$ governing the evolution of the money shares,

$$\hat m_t(s) = \hat m_{t-1}(s-1) - \hat\mu_t,$$

where these equations follow from the fact that individual velocities $v(s)$ are time-invariant. Rearranging the first equation and using $m(s) = 2(N - s)/(N + 1)$, we have for active households

$$\hat m_t(0) = -\sum_{s=1}^{N-1}\frac{m(s)}{m(0)}\,\hat m_t(s) = -\sum_{s=1}^{N-1}\frac{N - s}{N}\,\hat m_t(s),$$

and after iterating on the transitions for inactive households

$$\hat m_t(s) = \hat m_{t-s}(0) - \sum_{k=1}^{s}\hat\mu_{t-k+1},$$

for $s = 1, \ldots, N - 1$. Combining these gives an ARMA representation of the dynamics of the money distribution:

$$\hat m_t(0) = -\sum_{s=1}^{N-1}\frac{N - s}{N}\,\hat m_{t-s}(0) + \sum_{s=1}^{N-1}\frac{N - s}{N}\sum_{k=1}^{s}\hat\mu_{t-k+1}.$$
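The two sets of equations above can be iterated directly, which gives a simple check on the ARMA representation. The sketch below starts from steady state, feeds in a sequence of money growth shocks, and computes the active households' money share both from the state recursion and from the ARMA formula with a zero pre-sample history. It also computes the equal-weighted mean of the log deviations across household types, which is the log deviation of velocity in this steady state because v(s)m(s) is the same for every s, so that the path can be compared with the closed form reported in equation (52). Function names are hypothetical.

```python
def simulate_distribution(N, shocks):
    """Iterate the log-linearized money shares; shocks[t] is mu_hat at date t."""
    state = [0.0] * N
    path = []
    for mu_hat in shocks:
        new = [0.0] * N
        for s in range(1, N):
            new[s] = state[s - 1] - mu_hat  # inactive households
        # Active households' share from the adding-up constraint.
        new[0] = -sum((N - s) / N * new[s] for s in range(1, N))
        path.append(new)
        state = new
    return path


def arma_active_share(N, shocks):
    """Active share from the ARMA representation, zero history before date 0."""
    def shock(t):
        return shocks[t] if t >= 0 else 0.0

    m0 = []
    for t in range(len(shocks)):
        ar = sum((N - s) / N * m0[t - s] for s in range(1, N) if t - s >= 0)
        ma = sum(
            (N - s) / N * sum(shock(t - k + 1) for k in range(1, s + 1))
            for s in range(1, N)
        )
        m0.append(ma - ar)
    return m0


def velocity_path(path):
    """Log deviation of velocity as the mean of the log money-share deviations."""
    return [sum(state) / len(state) for state in path]
```

For a unit shock at date 0 and no shocks afterward, the simulated velocity path over the first N periods can be compared with (1/2)(1 + 1/N)^(k+1) − 1 for k = 0, ..., N − 1.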
The log deviation of velocity can be written

$$\hat v_t = \frac{1}{N}\sum_{s=0}^{N-1}\hat m_t(s),$$

using $v(s)m(s) = 2/(N + 1) = \bar v$ for all $s$. Differencing this once and simplifying gives

$$\Delta\hat v_t = \frac{1}{N}\sum_{s=0}^{N-1}\Delta\hat m_t(s) = \frac{1}{N}\left[\hat m_t(0) - \hat m_{t-N}(0) - (N - 1)\hat\mu_t + \sum_{s=1}^{N-1}\hat\mu_{t-s}\right],$$

where $\Delta$ here denotes the first-difference operator, and which repeatedly uses $\hat m_{t-1}(s-1) = \hat m_t(s) + \hat\mu_t$ to cancel terms in the sum. Let the economy start in steady state for $t < 0$ and consider a given shock $\hat\mu_t$ at date $t$ with $\hat\mu_{t+k} = 0$ for all $k > 0$. For the first $N - 1$ periods after a shock, the terms $\hat m_{t-N}(0)$ and the sum $\sum_{s=1}^{N-1}\hat\mu_{t-s}$ are zero, so that $\Delta\hat v_t = [\hat m_t(0) - (N - 1)\hat\mu_t]/N$. We can solve this for $\hat m_t(0) = N\Delta\hat v_t + (N - 1)\hat\mu_t$ and use the ARMA representation for the money share of active households to get an ARMA representation of velocity growth that is exact for the first $N - 1$ periods,

$$\Delta\hat v_t = -\sum_{s=1}^{N-1}\frac{N - s}{N}\,\Delta\hat v_{t-s} - \frac{1}{2}\,\frac{N - 1}{N}\,\hat\mu_t$$

(using $\hat\mu_{t-s} = 0$ for the first $N - 1$ periods). Rearranging terms to write this in levels, we get

$$\hat v_t = \frac{1}{N}\sum_{s=1}^{N-1}\hat v_{t-s} - \frac{1}{2}\,\frac{N - 1}{N}\,\hat\mu_t$$

(this time using $\hat v_{t-N} = 0$ for the first $N - 1$ periods). When $N$ is large, so that $(N - 1)/N \approx 1$, this implies that the impulse response of the log of velocity over the first $N - 1$ periods is given by

$$\hat v_{t+k} = \frac{1}{2}\left(1 + \frac{1}{N}\right)^{k+1} - 1. \tag{52}$$

This starts at approximately $\hat v_t = -1/2$; for large $N$ it crosses zero at roughly $k = N\log(2)$ and then rises above zero until $k = N$.

D. Proof of Indeterminacy Proposition

Using that $u(c) = \log(c)$ and $\gamma = 0$, so that $P_t c_t(0) = v(0)M_t(0)$, and that

$$\frac{u'(c_t(0))}{P_t} = \frac{1}{P_t c_t(0)} = \frac{1}{v(0)\,M_t(0)},$$

the sequence of $M_t(0)$ that supports the interest rate path $\{i_t^*\}_{t=0}^{\infty}$ must satisfy

$$\frac{M_{t+1}(0)}{M_t(0)} = (1 + i_t^*)\,\beta, \qquad t = 0, 1, \ldots,$$

or

$$M_{t+1}(0) = M_0(0)\,\beta^{t}\prod_{j=0}^{t}(1 + i_j^*). \tag{53}$$
For future reference, we can write equation (53) as

$$M_{t-1-s}(0) = M_0(0)\,\beta^{t-1-s-1}\prod_{j=0}^{t-1-s-1}(1 + i_j^*), \tag{54}$$

which applies if $t - 1 - s \geq 0$ or $s \leq t - 1$. Now again using that $u(c) = \log(c)$ and $\gamma = 0$, we have

$$M_t(s) = (1 - v(s-1))\,M_{t-1}(s-1), \qquad s = 1, \ldots, N,$$

which we can substitute into

$$M_t(0) = N M_t - \sum_{s=1}^{N-1}(1 - v(s-1))\,M_{t-1}(s-1)$$

to obtain

$$M_t(0) = N(M_t - M_{t-1}) + \sum_{s=0}^{N-1}\theta_s\,M_{t-1-s}(0), \tag{55}$$

where the coefficients $\theta_s$ are given by $\theta_s \equiv v(s)\left[\prod_{j=0}^{s-1}(1 - v(j))\right] > 0$. It is easy to verify that any sequence of $\{M_t - M_{t-1}\}$ for $t \geq 0$ and $\{M_t(0)\}$ for $t \geq -N + 1$ that solves equation (55) completely characterizes an equilibrium. Now we specialize equation (55) for three different types of time periods. For $t = 0$ we have

$$M_0(0) = N\left(M_0 - M^*_{-1}\right) + \sum_{s=0}^{N-1}\theta_s\,M^*_{-1-s}(0). \tag{56}$$
For $t = 1, 2, \ldots, N - 1$ we can break the sum into two parts and use the expression for $M_{t-1-s}(0)$ in terms of interest rates, equation (54), so we have

$$M_t(0) = N(M_t - M_{t-1}) + \sum_{s=0}^{t-1}\theta_s\,M_{t-1-s}(0) + \sum_{s=t}^{N-1}\theta_s\,M^*_{t-1-s}(0)$$

$$= N(M_t - M_{t-1}) + \sum_{s=0}^{t-1}\theta_s\,M_0(0)\,\beta^{t-1-s-1}\prod_{j=0}^{t-1-s-1}(1 + i_j^*) + \sum_{s=t}^{N-1}\theta_s\,M^*_{t-1-s}(0), \tag{57}$$

and using the expression for the interest rate equation (53) again,

$$M_0(0)\,\beta^{t-1}\prod_{j=0}^{t-1}(1 + i_j^*) = N(M_t - M_{t-1}) + \sum_{s=0}^{t-1}\theta_s\,M_0(0)\,\beta^{t-1-s-1}\prod_{j=0}^{t-1-s-1}(1 + i_j^*) + \sum_{s=t}^{N-1}\theta_s\,M^*_{t-1-s}(0). \tag{58}$$

Finally, for $t = N, N + 1, \ldots$, we have

$$M_t(0) = N(M_t - M_{t-1}) + \sum_{s=0}^{N-1}\theta_s\,M_0(0)\,\beta^{t-1-s-1}\prod_{j=0}^{t-1-s-1}(1 + i_j^*),$$

and inserting the expression for $M_t(0)$ based on the interest rates,

$$M_0(0)\,\beta^{t-1}\prod_{j=0}^{t-1}(1 + i_j^*) = N(M_t - M_{t-1}) + \sum_{s=0}^{N-1}\theta_s\,M_0(0)\,\beta^{t-1-s-1}\prod_{j=0}^{t-1-s-1}(1 + i_j^*). \tag{59}$$
Now we are ready to construct the path of the remaining variables for an equilibrium that supports the interest rate path $\{i_t^*\}_{t=0}^{\infty}$. We do this in three steps, one for each type of time period. We do this for an arbitrary value of $M_0$.

Step a. Solve for $M_0(0)$. For $t = 0$, $M_0(0)$ is a function of predetermined variables, $M^*_{-1}$, $M^*_j(0)$ for $j < 0$, and $M_0$. Thus, for the given value of $M_0$ there is a unique value of $M_0(0)$.

Step b. Solve for $M_t(0)$ and $M_t$ for $t = 1, \ldots, N - 1$. Equation (58) gives one equation in one unknown, namely $M_t - M_{t-1}$, given $M_0(0)$. Using these equations recursively, starting from $M_0$ and the value of $M_0(0)$ found in Step a, we can solve for $M_1, \ldots, M_{N-1}$.

Step c. Solve for $M_t$ for $t \geq N$. Given the initial condition $M_{N-1}$ found in Step b, equation (59) can be used to solve for $M_t$ for $t \geq N$.

Steps a through c show that for any given $M_0$ there is a unique way to construct an equilibrium that supports the path of interest rates $\{i_t^*\}_{t=0}^{\infty}$.

We now show that for any equilibrium that supports the interest rate sequence $\{i_t^*\}_{t=0}^{\infty}$, the distribution of cash $M_t(s)/M_t$ for $s = 0, \ldots, N - 1$ is the same for all $t \geq N$. Using equation (53) for $t \geq N$ in

$$M_t(0) = N M_t - \sum_{s=1}^{N-1}(1 - v(s-1))\,M_{t-1}(s-1),$$

we obtain

$$M_t(0) = N M_t - \sum_{s=1}^{N-1}(1 - v(s-1))\sum_{k=1}^{s-1} v(k)\,M_{t-k}(0),$$

and using equation (53) we get

$$M_0(0)\,\beta^{t-1}\prod_{j=0}^{t-1}(1 + i_j^*) = N M_t - \sum_{s=1}^{N-1}(1 - v(s-1))\sum_{k=1}^{s-1} v(k)\,M_0(0)\,\beta^{t-k-1}\prod_{j=0}^{t-k-1}(1 + i_j^*),$$

which shows that the path of $M_t$ is proportional to $M_0(0)$ for $t \geq N$. Finally, equation (53) implies that the path of $M_t(s)$ is proportional to $M_0(0)$, which establishes the desired result. This in turn immediately implies that $M_t(s)/M_t = M_t^*(s)/M_t^*$ and $M_{t+1}/M_t = M^*_{t+1}/M_t^*$, and thus that $c_t(s) = c_t^*(s)$ and $P_{t+1}/P_t = P^*_{t+1}/P_t^*$ for $t \geq N$.
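The construction can be illustrated numerically. The sketch below is an assumption-laden illustration, not the proof. It works from the primitives (the law of motion for inactive households, money market clearing at date 0, and the growth condition for active households' money) rather than from equations (53) through (59). It takes a hypothetical pre-shock distribution, picks two different values of date-0 aggregate money, and compares the resulting money shares. All function names and parameter values are hypothetical.

```python
def velocity(s, N, beta):
    """Individual velocity with log utility, no paychecks, unit period length."""
    return (1 - beta) / (1 - beta ** (N - s))


def reference_holdings(N, beta):
    """Steady-state holdings M(s) with constant money and aggregate M = 1."""
    w = [beta ** s * (1 - beta ** (N - s)) for s in range(N)]
    return [N * x / sum(w) for x in w]


def simulate_shares(M0, rates, N, beta, start):
    """Money shares M_t(s)/M_t for t = 0..len(rates), given date-0 money M0."""
    prev = list(start)  # holdings M_{-1}(s) inherited from the past
    shares = []
    for t in range(len(rates) + 1):
        cur = [0.0] * N
        for s in range(1, N):
            # Inactive households carry over their unspent balances.
            cur[s] = (1 - velocity(s - 1, N, beta)) * prev[s - 1]
        if t == 0:
            cur[0] = N * M0 - sum(cur[1:])  # money market clearing at date 0
        else:
            cur[0] = beta * (1 + rates[t - 1]) * prev[0]  # supports i_{t-1}
        M = sum(cur) / N
        shares.append([x / M for x in cur])
        prev = cur
    return shares
```

Running `simulate_shares` for two nearby values of `M0` with the same interest rate path gives different shares at date 0, while the shares from date N onward coincide up to rounding, in line with the proposition.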
Finally, the qualification that $M_0$ has to be close to $M_0^*$ ensures that the values constructed for $M_t(0)$ during the periods $t = 0, \ldots, N - 1$ are all strictly positive.

UNIVERSITY OF CHICAGO AND NATIONAL BUREAU OF ECONOMIC RESEARCH
UNIVERSITY OF CALIFORNIA–LOS ANGELES, FEDERAL RESERVE BANK OF MINNEAPOLIS, AND NATIONAL BUREAU OF ECONOMIC RESEARCH
NEW YORK UNIVERSITY AND UNIVERSITY OF MELBOURNE
REFERENCES

Abel, Andrew B., Janice C. Eberly, and Stavros Panageas, “Optimal Inattention to the Stock Market,” American Economic Review, 97 (2007), 244–249.
Altig, David, Lawrence J. Christiano, Martin Eichenbaum, and Jesper Linde, “Firm-Specific Capital, Nominal Rigidities and the Business Cycle,” Federal Reserve Bank of Cleveland Working Paper No. 04-16, 2004.
Alvarez, Fernando, and Andrew Atkeson, “Money and Exchange Rates in the Grossman–Weiss–Rotemberg Model,” Journal of Monetary Economics, 40 (1997), 619–640.
Alvarez, Fernando, Andrew Atkeson, and Patrick J. Kehoe, “Money, Interest Rates, and Exchange Rates with Endogenously Segmented Markets,” Journal of Political Economy, 110 (2002), 73–112.
Alvarez, Fernando, Andrew Atkeson, and Patrick J. Kehoe, “Time-Varying Risk, Interest Rates, and Exchange Rates in General Equilibrium,” Federal Reserve Bank of Minneapolis Research Department Staff Report No. 371, 2007.
Alvarez, Fernando, Robert E. Lucas, Jr., and Warren E. Weber, “Interest Rates and Inflation,” American Economic Review, 91 (2001), 219–225.
Barro, Robert J., “Integral Constraints and Aggregation in an Inventory Model of Money Demand,” Journal of Finance, 31 (1976), 77–88.
Baumol, William J., “The Transactions Demand for Cash: An Inventory Theoretic Approach,” Quarterly Journal of Economics, 66 (1952), 545–556.
Blanchard, Olivier Jean, and Charles M. Kahn, “The Solution of Linear Difference Models under Rational Expectations,” Econometrica, 48 (1980), 1305–1311.
Calvo, Guillermo A., “Staggered Prices in a Utility-Maximizing Framework,” Journal of Monetary Economics, 12 (1983), 383–398.
Canzoneri, Matthew B., Robert E. Cumby, and Behzad T. Diba, “Euler Equations and Money Market Interest Rates: A Challenge for Monetary Policy Models,” Journal of Monetary Economics, 54 (2007), 1863–1881.
Caplin, Andrew, and John Leahy, “State-Dependent Pricing and the Dynamics of Money and Output,” Quarterly Journal of Economics, 106 (1991), 683–708.
Chatterjee, Satyajit, and Dean Corbae, “Endogenous Market Participation and the General Equilibrium Value of Money,” Journal of Political Economy, 100 (1992), 615–646.
Chiu, Jonathan, “Endogenously Segmented Asset Market in an Inventory-Theoretic Model of Money Demand,” Bank of Canada Working Paper No. 2007-46, 2007.
Christiano, Lawrence, Martin Eichenbaum, and Charles Evans, “Monetary Policy Shocks: What Have We Learned and to What End?” in Handbook of Macroeconomics, Michael Woodford and John Taylor, eds. (Amsterdam: North-Holland, 1999).
Christiano, Lawrence, Martin Eichenbaum, and Charles Evans, “Nominal Rigidities and the Dynamic Effects of a Shock to Monetary Policy,” Journal of Political Economy, 113 (2005), 1–45.
Clark, Timothy, Astrid Dick, Beverly Hirtle, Kevin J. Stiroh, and Robard Williams, “The Role of Retail Banking in the U.S. Banking Industry: Risk, Return, and Industry Structure,” Federal Reserve Bank of New York Economic Policy Review, 13 (2007), 39–56.
Cochrane, John H., “Shocks,” Carnegie-Rochester Conference Series on Public Policy, 41 (1994), 295–364.
Dotsey, Michael, Robert G. King, and Alexander L. Wolman, “State-Dependent Pricing and the General Equilibrium Dynamics of Money and Output,” Quarterly Journal of Economics, 114 (1999), 655–690.
Duffie, Darrell, and Tong-Sheng Sun, “Transactions Costs and Portfolio Choice in a Discrete-Continuous-Time Setting,” Journal of Economic Dynamics and Control, 14 (1990), 35–51.
Edmond, Chris, “Sticky Prices versus Sticky Demand,” University of Melbourne, Working Paper, 2003.
Federal Reserve Board, Flow of Funds Accounts of the United States (Washington, DC: Federal Reserve Board, 2007).
Fischer, Stanley, “Long-Term Contracts, Rational Expectations, and the Optimum Money Supply Rule,” Journal of Political Economy, 85 (1977), 191–205.
Golosov, Mikhail, and Robert E. Lucas, Jr., “Menu Costs and Phillips Curves,” Journal of Political Economy, 115 (2007), 171–199.
Grossman, Sanford J., and Laurence Weiss, “A Transactions-Based Model of the Monetary Transmission Mechanism,” American Economic Review, 73 (1983), 871–880.
Investment Company Institute, Equity Ownership in America (Washington, DC: Investment Company Institute, 2002).
Jovanovic, Boyan, “Inflation and Welfare in the Steady State,” Journal of Political Economy, 90 (1982), 561–577.
Khan, Aubhik, and Julia K. Thomas, “Inflation and Interest Rates with Endogenous Market Segmentation,” Federal Reserve Bank of Philadelphia Working Paper 07-1, 2007.
King, Robert G., and Julia K. Thomas, “Breaking the New Keynesian Dichotomy: Asset Market Segmentation and the Monetary Transmission Mechanism,” Ohio State University, Working Paper, 2007.
Klein, Paul, “Using the Generalized Schur Form to Solve a Multivariate Linear Rational Expectations Model,” Journal of Economic Dynamics and Control, 24 (2000), 1405–1423.
Leeper, Eric M., Christopher A. Sims, and Tao Zha, “What Does Monetary Policy Do?” Brookings Papers on Economic Activity, 2 (1996), 1–78.
Lucas, Robert E., Jr., “Asset Prices in an Exchange Economy,” Econometrica, 46 (1978), 1429–1445.
Mankiw, N. Gregory, “The Inexorable and Mysterious Tradeoff between Inflation and Unemployment,” Economic Journal, 111 (2001), C45–C61.
Mankiw, N. Gregory, and Ricardo Reis, “Sticky Information versus Sticky Prices: A Proposal to Replace the New Keynesian Phillips Curve,” Quarterly Journal of Economics, 117 (2002), 1295–1328.
Midrigan, Virgiliu, “Menu Costs, Multi-Product Firms, and Aggregate Fluctuations,” Federal Reserve Bank of Minneapolis, Working Paper, 2006.
Ravn, Morten O., and Harald Uhlig, “On Adjusting the Hodrick–Prescott Filter for the Frequency of Observations,” Review of Economics and Statistics, 84 (2002), 371–380.
Romer, David, “A Simple General Equilibrium Version of the Baumol–Tobin Model,” Quarterly Journal of Economics, 101 (1986), 663–686.
Rotemberg, Julio J., “Monopolistic Price Adjustment and Aggregate Output,” Review of Economic Studies, 49 (1982), 517–531.
Rotemberg, Julio J., “A Monetary Equilibrium Model with Transactions Costs,” Journal of Political Economy, 92 (1984), 40–58.
Silva, André C., “Prices and Money after Interest Rate Shocks with Endogenous Market Segmentation,” Universidade Nova de Lisboa, Working Paper, 2008.
Sims, Christopher A., “Stickiness,” Carnegie–Rochester Series on Public Policy, 49 (1998), 317–356.
Taylor, John B., “Aggregate Dynamics and Staggered Contracts,” Journal of Political Economy, 88 (1980), 1–23.
Tobin, James, “The Interest-Elasticity of the Transactions Demand for Cash,” Review of Economics and Statistics, 38 (1956), 241–247.
Uhlig, Harald, “A Toolkit for Analysing Nonlinear Dynamic Stochastic Models Easily,” in Computational Methods for the Study of Dynamic Economies, Ramon Marimon and Andrew Scott, eds. (New York: Oxford University Press, 1999).
Uhlig, Harald, “What Are the Effects of Monetary Policy on Output? Results from an Agnostic Identification Procedure,” Journal of Monetary Economics, 52 (2005), 382–419.
U.S. Department of Commerce, Bureau of Economic Analysis, National Income and Product Accounts of the United States (Washington, DC: various years).
Vissing-Jorgensen, Annette, “Towards an Explanation of Household Portfolio Choice Heterogeneity: Nonfinancial Income and Participation Cost Structures,” National Bureau of Economic Research Working Paper No. 8884, 2002.
Woodford, Michael, “Imperfect Common Knowledge and the Effects of Monetary Policy,” in Knowledge, Information, and Expectations in Modern Macroeconomics, Philippe Aghion, Roman Frydman, Joseph Stiglitz, and Michael Woodford, eds. (Princeton, NJ: Princeton University Press, 2003a).
Woodford, Michael, Interest and Prices: Foundations of a Theory of Monetary Policy (Princeton, NJ: Princeton University Press, 2003b).
E-ZTAX: TAX SALIENCE AND TAX RATES∗

AMY FINKELSTEIN

This paper examines whether the salience of a tax system affects equilibrium tax rates. I analyze how tolls change after toll facilities adopt electronic toll collection (ETC); drivers are substantially less aware of tolls paid electronically. I estimate that, in steady state, tolls are 20 to 40 percent higher than they would have been without ETC. Consistent with a salience-based explanation for this toll increase, I find that under ETC, driving becomes less elastic with respect to the toll and toll setting becomes less sensitive to the electoral calendar. Alternative explanations appear unlikely to be able to explain the findings.
I. INTRODUCTION

For every dollar of revenue raised by the U.S. income tax system, taxpayers incur about ten cents in private compliance costs associated with record keeping and tax filing (Slemrod 1996). These compliance costs impose a deadweight burden on society. Yet policies that would reduce these costs are frequently opposed by policy makers and economists who believe that compliance costs play an important role in keeping taxes visible and salient to the electorate, who then serve as an important check on attempts to raise the scale of government activity beyond what an informed citizenry would want. For example, Milton Friedman has publicly lamented his inadvertent contribution to the growth of government by encouraging the introduction of the visibility-reducing Federal income tax withholding system during the Second World War (Friedman and Friedman 1998, p. 123). More recently, in 2005, the President’s Advisory Panel on Federal Tax Reform failed to reach consensus on whether to replace part of the existing income tax system with a value-added tax (VAT), in part because of concerns about how the lower visibility of a VAT would affect the size of government. As the Advisory Panel noted in its report:

[Some] Panel Members were unwilling to support the [VAT] proposal given the lack of conclusive empirical evidence on the impact of a VAT on the growth of government. Others were more confident that voters could be relied on to understand the amount of tax being paid through a VAT, in part because the proposal studied by the Panel would require the VAT to be separately stated on each sales receipt provided to consumers. These members of the Panel envisioned that voters would appropriately control growth in the size of the federal government through the electoral process. (The President’s Advisory Panel on Federal Tax Reform 2005, pp. 203–204)

∗ I am grateful to Daron Acemoglu, Gene Amromin, Pol Antràs, David Autor, Raj Chetty, Peter Diamond, Liran Einav, Hanming Fang, Naomi Feldman, Edward Glaeser, Mike Golosov, Austan Goolsbee, Jerry Hausman, Larry Katz, Erzo Luttmer, Brigitte Madrian, Sean Nicholson, Ben Olken, Jim Poterba, Nancy Rose, Stephen Ryan, Monica Singhal, Heidi Williams, Clifford Winston, two anonymous referees, and seminar participants at Cornell, MIT, Berkeley, Stanford GSB, Yale, the NBER Public Economics Meeting, Harvard, and Stanford for helpful comments; to James Wang and especially Julia Galef for outstanding research assistance; to Tatyana Deryugina, Julia Galef, Stephanie Hurder, and Erin Strumpf for help in conducting the survey of toll awareness; and to the innumerable employees of toll operating authorities around the country who generously took the time to provide data and to answer my many questions.

© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009
The idea that a less visible tax system may fuel the growth of government can be traced back at least to John Stuart Mill’s 1848 Principles of Political Economy. It has its modern roots in the public choice tradition of “fiscal illusion.” In a series of influential books and articles, James Buchanan and co-authors have argued that citizens systematically underestimate the tax price of public sector activities, and that government in turn exploits this misperception to reach a size that is larger than an informed citizenry would want. The extent of the tax misperception—and thus the size of government—is in turn affected by the choice of tax instruments, with more complicated and less visible taxes exacerbating the extent of fiscal illusion and thereby increasing the size of the government (e.g., Buchanan [1967]; Buchanan and Wagner [1977]; Brennan and Buchanan [1980]). Empirical evidence of the impact of tax salience on tax rates, however, has proved extremely elusive. Most of the evidence comes from cross-sectional studies of the relationship between the size of government and the visibility of the tax system, where the direction of causality is far from clear (Oates 1988; Dollery and Worthington 1996). Moreover, as I discuss in more detail below, the sign of any effect of tax salience on tax rates is theoretically ambiguous. The link between tax salience and tax rates is therefore an open empirical question. In this paper, I examine the relationship between tax salience and tax rates empirically by studying the impact of the adoption of electronic toll collection (ETC) on toll rates. Electronic toll collection systems—such as the eponymous E-ZPass in the northeastern United States, I-Pass in Illinois, or Fast-Trak in California—allow automatic deduction of the toll as the car drives through a toll plaza. Because the driver need no longer actively count out and hand over cash for the toll, the toll rate may well be less salient
to the driver when paying electronically than when paying cash. Indeed, I present survey evidence that indicates a strikingly lower awareness of the amount paid in tolls by those who pay electronically relative to those who pay using cash. This discrepancy in toll awareness exists even among regular commuters on a toll facility. As a result, toll facilities’ adoption of ETC—and the resultant switch by many drivers to paying electronically—provides a setting in which to examine the impact of tax salience on tax rates. Different toll facilities in the United States have adopted ETC at different points in time over the last several decades, and some have not yet adopted it. To study the impact of ETC, I examine the within toll-facility changes in toll rates associated with the adoption and diffusion of ETC. To do so, I collected a new data set on the history of toll rates and ETC installation for 123 toll facilities in the United States. Where they were available, I also collected annual facility-level data on toll traffic, toll revenue, and the share of each that is paid by electronic toll collection. I find robust evidence that toll rates increase after the adoption of electronic toll collection. My estimates suggest that when the proportion of tolls paid using ETC has diffused to its steady state level of about 60 percent, toll rates are 20 to 40 percent higher than they would have been under a fully manual toll collection system. I also present evidence of two potential mechanisms by which reduced salience may contribute to increased toll rates. First, I find that the elasticity of driving with respect to the toll declines (in absolute value) with the adoption of electronic toll collection, suggesting that ETC may raise the optimal level of the toll. Second, I show that under ETC, toll-setting behavior becomes less sensitive to the local election calendar, suggesting that ETC may reduce the political costs of raising tolls. 
The rest of the paper proceeds as follows. Section II provides a conceptual framework for how tax salience may affect tax rates and the factors that may affect the (ambiguous) sign of this relationship. Section III presents evidence that tolls are less salient when paid by ETC than by cash. Section IV describes the data on toll rates and driving. Section V estimates the impact of ETC on the elasticity of driving with respect to the toll. Section VI estimates the impact of ETC on toll rates. Section VII considers non-salience-based explanations for these empirical findings. The last section concludes.
II. EFFECTS OF TAX SALIENCE ON CONSUMERS AND GOVERNMENT: CONCEPTUAL FRAMEWORK

In a fully salient tax system, individuals are aware of actual taxes as they make economic and political decisions. In a less salient tax system, individuals are not aware of the actual tax ($\tau$), but instead have a perception of the tax, which I denote by $\tilde{\tau}$. Recent empirical evidence is consistent with individuals misperceiving taxes (Liebman and Zeckhauser 2004; Feldman and Katascak 2005; Chetty, Kroft, and Looney forthcoming) and with the salience of the tax affecting the extent of this misperception (Chetty, Kroft, and Looney forthcoming).

This paper focuses on the response of tax rates to tax salience. However, because an input into this response is how consumers’ economic behavior is affected by tax salience, I begin (in both the conceptual framework and the subsequent empirical work) by analyzing the consumers’ response; I then turn to the government’s response.

I denote by $\theta \ge 0$ the (lack of) salience of the tax system. A higher $\theta$ corresponds to a less salient tax system; $\theta = 0$ corresponds to a fully salient system. In the empirical application I will examine the move from manual (i.e., cash) toll collection to electronic toll collection (ETC) and interpret this as a move to a less salient tax system (i.e., an increase in $\theta$); I present survey evidence in Section III that is consistent with the assumption that ETC reduces the salience of tolls.

There are two types of tax salience that may affect tax setting: tax salience at the time of the consumption decision for the taxed good, and tax salience at the time of voting. These need not be the same. To capture this, I denote the perceived tax by $\tilde{\tau}_j$, where $j \in \{C, V\}$ indicates perceived taxes at the time of consumption and of voting, respectively. For simplicity I assume the perceived tax is a linear function of the actual tax,

$$\tilde{\tau}_j(\theta) \equiv \delta_{0j}(\theta) + \delta_{1j}(\theta)\,\tau, \tag{1}$$

and normalize a fully salient system as one in which the perceived and actual tax are the same (i.e., $\delta_{0j}(0) = 0$ and $\delta_{1j}(0) = 1$). I assume that $\delta_{1j}(\theta) > 0$ (i.e., the perceived tax is increasing in the actual tax). I also assume that in a less salient tax system, the link between the perceived and the actual tax is weaker (i.e., $\delta'_{1j}(\theta) < 0$). The effect of the tax salience on the perceived toll level
is, however, a priori ambiguous; in other words, $\delta'_{0j}(\theta)$ can be of either sign. For simplicity, I consider only cases of positive taxation ($\tau > 0$), and further assume that $\tilde{\tau}_j > 0$.

II.A. Response of Consumer Economic Behavior to Tax Salience

The individual chooses consumption of the taxed good based on the perceived tax at the time of the consumption decision, $\tilde{\tau}_C(\theta)$. To simplify the analysis, I assume the individual maximizes a utility function that is quasi-linear in the taxed good and exhibits constant elasticity of demand.1 The individual thus solves

$$\max_{x_1}\; \gamma_0\, x_1^{\left(\frac{1}{\gamma_1}+1\right)} + x_2 \quad \text{subject to} \quad x_2 + \bigl(p + \tilde{\tau}_C(\theta)\bigr)\,x_1 \le m, \tag{2}$$

where $x_1$ denotes the taxed good (with producer price $p$), $x_2$ denotes all other goods (whose price has been normalized to 1), and $m$ is consumer income. I denote by $\eta(\tilde{\tau}_C) \equiv \gamma_1$ the (constant) elasticity of demand for $x_1$, which I assume is negative. Note that $\eta(\tilde{\tau}_C)$ is the elasticity of demand with respect to the perceived price $p + \tilde{\tau}_C(\theta)$; I denote by $\eta(\tau)$ the elasticity of demand with respect to the actual price $p + \tau$.

To see how consumer responsiveness to the tax changes with the salience of the tax, I will estimate empirically how the elasticity of demand with respect to the actual price ($\eta(\tau)$) varies with the tax salience ($\theta$). The sign of this relationship (i.e., the sign of $\partial\eta(\tau)/\partial\theta$) is ambiguous. To see this, note that the relationship between $\eta(\tau)$ (which I will estimate empirically) and $\eta(\tilde{\tau}_C)$ (which I have assumed is constant) can be derived as follows:

$$\eta(\tau) \equiv \frac{\partial x_1}{\partial(p+\tau)}\,\frac{(p+\tau)}{x_1} = \frac{\partial x_1}{\partial(p+\tilde{\tau}_C)}\,\frac{\partial(p+\tilde{\tau}_C)}{\partial(p+\tau)}\,\frac{(p+\tau)}{x_1}\,\frac{p+\tilde{\tau}_C}{p+\tilde{\tau}_C} = \eta(\tilde{\tau}_C)\,\frac{p+\tau}{p+\tilde{\tau}_C}\,\frac{\partial(p+\tilde{\tau}_C)}{\partial(p+\tau)}. \tag{3}$$

Under the assumption of fixed producer prices (i.e., $p$ does not vary with either $\tau$ or $\theta$), the relationship between the perceived tax and actual tax in equation (1) implies that

$$\frac{\partial(p+\tilde{\tau}_C)}{\partial(p+\tau)} = \frac{\partial\tilde{\tau}_C}{\partial\tau} = \delta_{1C}(\theta). \tag{4}$$
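The chain-rule argument in (3) and (4) can be checked numerically. The following is a minimal sketch, assuming constant-elasticity demand $x_1 = (p + \tilde{\tau}_C)^{\gamma_1}$ (the form implied by (2) up to a scale factor) and the linear perception rule in (1). The function names and all parameter values are hypothetical choices for illustration, not estimates.

```python
import math


def perceived_tax(tau, delta0, delta1):
    """Linear perception rule from equation (1): tau_tilde = delta0 + delta1 * tau."""
    return delta0 + delta1 * tau


def demand(p, tau, delta0, delta1, gamma1):
    """Constant-elasticity demand in the perceived price (scale normalized to 1)."""
    return (p + perceived_tax(tau, delta0, delta1)) ** gamma1


def elasticity_numeric(p, tau, delta0, delta1, gamma1, h=1e-6):
    """Elasticity of demand with respect to the actual price p + tau,
    approximated by a central difference in logs (p held fixed)."""
    d_log_x = (math.log(demand(p, tau + h, delta0, delta1, gamma1))
               - math.log(demand(p, tau - h, delta0, delta1, gamma1)))
    d_log_price = math.log(p + tau + h) - math.log(p + tau - h)
    return d_log_x / d_log_price


def elasticity_formula(p, tau, delta0, delta1, gamma1):
    """Closed form obtained by combining (3) and (4):
    eta(tau) = gamma1 * (p + tau) / (p + tau_tilde) * delta1."""
    tau_tilde = perceived_tax(tau, delta0, delta1)
    return gamma1 * (p + tau) / (p + tau_tilde) * delta1


# Hypothetical parameter values, chosen only for illustration.
params = dict(p=1.0, tau=0.5, delta0=0.1, delta1=0.8, gamma1=-0.5)
print(elasticity_numeric(**params), elasticity_formula(**params))
```

With $\delta_{0C} = 0$ and $\delta_{1C} = 1$ (the fully salient normalization), the closed form collapses to $\gamma_1$, so the elasticities with respect to the actual and the perceived price coincide.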
1. The assumption of quasi-linear utility seems a reasonable one when the taxed good is a small part of the overall consumer’s budget (such as the toll case I consider). It is not, however, an innocuous assumption for the political response to tax salience; I discuss this in more detail in Section II.B.
Using (4), we can simplify the relationship between $\eta(\tau)$ and $\eta(\tilde{\tau}_C)$ in (3) to

$$\eta(\tau) = \eta(\tilde{\tau}_C)\,\frac{p+\tau}{p+\tilde{\tau}_C}\,\delta_{1C}(\theta). \tag{5}$$

Differentiating both sides of (5) with respect to salience ($\theta$) gives

$$\frac{\partial\eta(\tau)}{\partial\theta} = \underbrace{\eta(\tilde{\tau}_C)\,(p+\tau)}_{-} \times \Biggl( \underbrace{\frac{-1}{(p+\tilde{\tau}_C)^2}}_{-}\;\underbrace{\bigl(\delta'_{0C}(\theta) + \delta'_{1C}(\theta)\,\tau\bigr)}_{?}\;\underbrace{\delta_{1C}(\theta)}_{+} \;+\; \underbrace{\frac{1}{(p+\tilde{\tau}_C)}}_{+}\;\underbrace{\delta'_{1C}(\theta)}_{-} \Biggr). \tag{6}$$

Equation (6) shows that the sign of the impact of tax salience on the elasticity of demand (i.e., the sign of $\partial\eta(\tau)/\partial\theta$) is ambiguous, because the impact of salience on the level of the perceived tax (i.e., $\partial\tilde{\tau}_C/\partial\theta \equiv \delta'_{0C}(\theta) + \delta'_{1C}(\theta)\,\tau$) is of ambiguous sign.2 In the empirical work I find evidence that consumption behavior becomes less elastic as salience decreases (i.e., $\partial\eta(\tau)/\partial\theta > 0$). Equation (6) indicates that a sufficient (although not necessary) condition for $\partial\eta(\tau)/\partial\theta > 0$ is that $\delta'_{0C}(\theta) + \delta'_{1C}(\theta)\,\tau > 0$ (i.e., the perceived tax is increasing as salience decreases). In Section III I present survey evidence that is consistent with this condition, suggesting that these empirical findings are internally consistent.

To estimate $\partial\eta(\tau)/\partial\theta$ empirically, I multiply (5) through by $\partial\log(p+\tau)$ to obtain

$$\partial\log x_1 = \eta(\tilde{\tau}_C)\,\frac{p+\tau}{p+\tilde{\tau}_C}\,\delta_{1C}(\theta)\,\partial\log(p+\tau). \tag{7}$$

Taking a linear approximation to (7) around $\theta = 0$ and explicitly separating out the main effects from the interaction effect of interest, I estimate

$$\log(x_1) = \beta_1 \log(p+\tau) + \beta_2\,\theta + \beta_3\,\theta\,\log(p+\tau) + \varepsilon. \tag{8}$$
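As a rough illustration of how a specification like (8) separates the main effects from the interaction, the sketch below fits it by ordinary least squares on noise-free synthetic data. The intercept, the coefficient values, the price and salience grids, and the helper names `solve` and `ols` are all assumptions made for this example; none of them come from the toll data.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical "true" coefficients: intercept, beta1, beta2, beta3.
TRUE = [2.0, -0.5, 0.1, 0.2]

rows, y = [], []
for price in [1.0, 1.5, 2.0, 2.5, 3.0]:      # actual price p + tau
    for theta in [0.0, 0.3, 0.6]:            # (lack of) salience
        lp = math.log(price)
        x = [1.0, lp, theta, theta * lp]     # regressors of (8) plus an intercept
        rows.append(x)
        y.append(sum(c * v for c, v in zip(TRUE, x)))  # no noise term

intercept, beta1, beta2, beta3 = ols(rows, y)


def elasticity_at(theta):
    """Implied elasticity with respect to the actual price: beta1 + beta3 * theta."""
    return beta1 + beta3 * theta


print(beta1, beta3, elasticity_at(0.6))
```

Because the synthetic data contain no noise, the fit recovers the assumed coefficients up to floating-point error; with a positive interaction coefficient, the implied elasticity moves toward zero as $\theta$ rises.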
The parameter $\beta_1$ provides an estimate of the elasticity of demand in a fully salient system (i.e., $\theta = 0$), in which case $\eta(\tilde{\tau}_C) = \eta(\tau) = \beta_1$. The parameter of interest is $\beta_3$; it indicates how the elasticity changes with salience.

2. The other components of (6) are signed by the assumptions discussed earlier in this section.

II.B. Political Response to Tax Salience

The political response of tax rates to tax salience may depend not only on how the consumer’s behavioral responsiveness to tax changes with salience (i.e., $\partial\eta(\tau)/\partial\theta$) but also on how the political costs of taxes change with tax salience. Section II.A showed that the sign of the effect of tax salience on the consumer’s behavioral responsiveness is ambiguous. Moreover, any effect of tax salience on political costs need not be the same sign as any effect of tax salience on consumer behavioral responsiveness, because salience at the time of consumption and salience at the time of voting may be different; this creates further ambiguity in the sign of the relationship between tax salience and tax rates. This ambiguity motivates the empirical work that is the focus of this paper.

To gain some intuition into the determinants of the sign of the relationship between tax salience and tax rates, I consider a government that sets the tax to maximize a weighted sum of some economic objective and the (negative of) any political costs of the tax. For concreteness, I assume the economic objective of the tax is to raise revenue. I discuss other possible economic objectives (and how these affect the implications of tax salience) in Section II.D. The government chooses $\tau$ each year to maximize

$$\max_{\tau}\; \lambda\,\tau\,Q(p+\tilde{\tau}_C) - (1-\lambda)\,f(E)\,C(\tilde{\tau}_V), \tag{9}$$

where $0 \le \lambda \le 1$ represents the weight the government places on the economic objective of the tax (i.e., raising revenue) relative to the political cost of the tax, $C$ denotes the political cost of the tax, and $E$ is an indicator variable for whether or not it is an election year. I assume that $f(E) > 0$ and $f'(E) > 0$; in other words, the political costs of taxes are exogenously higher in election years, so that we expect a “political business cycle” in taxes (Nordhaus 1975); in the empirical work, I provide evidence of a political business cycle in toll setting. The government’s optimization problem yields the first-order condition for the tax rate

$$\tau^* = \frac{-Q(\tilde{\tau}_C)}{Q'(\tilde{\tau}_C)} + \frac{(1-\lambda)\,f(E)\,C'(\tilde{\tau}_V)}{\lambda\,Q'(\tilde{\tau}_C)}, \tag{10}$$
where to simplify notation I have defined $C' \equiv (\partial C/\partial\tilde{\tau})(\partial\tilde{\tau}/\partial\tau)$ and $Q' \equiv (\partial Q/\partial\tilde{\tau})(\partial\tilde{\tau}/\partial\tau)$. To ensure an interior solution to the optimal tax, I assume that $C' > 0$ (i.e., political costs are rising in the actual tax) and $Q' < 0$ (i.e., demand is falling in the actual tax). Note that both consumption salience and voting salience affect the choice of tax rate: the amount of revenue raised depends on the perceived tax at the time of the consumption decision (i.e., $\tilde{\tau}_C$), and the political cost of the tax depends on the perceived tax at the time of voting (i.e., $\tilde{\tau}_V$). Differentiation with respect to $\theta$ of the first-order condition for the government’s optimal tax level in (10) indicates that the sign of any effect of tax salience on the choice of tax rate is a priori ambiguous:

$$\frac{\partial\tau^*}{\partial\theta} = \underbrace{\frac{\partial\bigl(-Q/Q'\bigr)}{\partial\theta}}_{?} + \underbrace{\frac{(1-\lambda)\,f(E)}{\lambda}}_{+}\;\underbrace{\left(\frac{\dfrac{\partial C'}{\partial\theta}\,Q' - \dfrac{\partial Q'}{\partial\theta}\,C'}{(Q')^2}\right)}_{?}. \tag{11}$$
Although the sign of (11) is theoretically ambiguous, there are intuitive findings concerning how the relationship between tax salience and tax rates is likely affected by the effect of salience on the consumer’s behavioral responsiveness to taxes, and by the effect of salience on the political costs of taxes. To see this, consider first the simplest case in which $\lambda = 1$, so that the government only maximizes revenue. In that case, the politically optimal tax in equation (10) reduces to the standard inverse elasticity optimal tax equation

$$\frac{\tau^*}{p+\tau^*} = -\frac{1}{\eta(\tau)}, \tag{12}$$

and thus (under the assumption of fixed producer prices)

$$\text{sign of } \frac{\partial\tau^*}{\partial\theta} \;=\; \text{sign of } \underbrace{\frac{1}{\eta(\tau)^2}}_{+}\;\underbrace{\frac{\partial\eta(\tau)}{\partial\theta}}_{?}. \tag{13}$$
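The step from the first-order condition to the inverse elasticity rule and its comparative static can be spelled out as follows. This is a sketch that uses only the assumptions already stated in the text (fixed producer prices and $Q' < 0$), with the shorthand $Q \equiv Q(\tilde{\tau}_C)$ and demand $x_1 = Q$, so that $\eta(\tau) = Q'\,(p+\tau)/Q$.

```latex
\begin{align*}
\text{First-order condition of (9):}\quad
  & \lambda\left[\,Q + \tau^{*} Q'\,\right] - (1-\lambda)\,f(E)\,C' = 0
  \;\Longrightarrow\;
  \tau^{*} = -\frac{Q}{Q'} + \frac{(1-\lambda)\,f(E)\,C'}{\lambda\,Q'} . \\[4pt]
\text{With } \lambda = 1:\quad
  & \tau^{*} = -\frac{Q}{Q'} . \\[4pt]
\text{Because } \eta(\tau) \equiv \frac{Q'\,(p+\tau)}{Q}:\quad
  & -\frac{Q}{Q'} = -\frac{p+\tau^{*}}{\eta(\tau)}
  \;\Longrightarrow\;
  \frac{\tau^{*}}{p+\tau^{*}} = -\frac{1}{\eta(\tau)} . \\[4pt]
\text{Differentiating the right-hand side:}\quad
  & \frac{\partial}{\partial\theta}\left(-\frac{1}{\eta(\tau)}\right)
    = \frac{1}{\eta(\tau)^{2}}\,\frac{\partial \eta(\tau)}{\partial\theta} .
\end{align*}
```

Since $\tau^*/(p+\tau^*)$ is increasing in $\tau^*$ for a fixed $p > 0$, and setting aside any indirect dependence of $\eta(\tau)$ on $\tau^*$ itself, the sign of $\partial\tau^*/\partial\theta$ follows the sign of the last expression.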
Equation (13) indicates that, when the government sets taxes to maximize revenue, the sign of how taxes vary with salience is the sign of how the elasticity of demand with respect to the tax varies with salience (which as we saw in (6) can be of either sign). Intuitively, if a decline in salience lowers the behavioral response
to the tax (i.e., $\partial\eta(\tau)/\partial\theta > 0$), then the tax rate set by the government will be rising as salience declines. Note that the assumption of quasi-linear utility is important for this result, as it removes any distortionary effect of reduced salience on consumption of the taxed good that arises from the budgetary consequences of the misperceived tax. In the more general case, where such distortionary effects will exist, Chetty, Kroft, and Looney (forthcoming) show that even if reduced salience reduces the behavioral response to the tax, this is not sufficient for the optimal tax to increase; this is likely to be particularly important for taxes that are a large share of the individual’s budget, such as income taxes.

Moreover, if the government puts some weight on the political costs of taxes (i.e., $\lambda < 1$), this introduces another source of indeterminacy in the sign of the relationship between tax salience and tax rates. However, the model suggests that we can learn more about the likely sign of $\partial\tau^*/\partial\theta$ in (11) by examining how any political business cycle in tax setting changes as tax salience declines. To see this, note that

$$\frac{\partial^2\tau^*}{\partial\theta\,\partial E} = \underbrace{\left(\frac{(1-\lambda)\,f'(E)}{\lambda}\right)}_{+}\;\underbrace{\left(\frac{\dfrac{\partial C'}{\partial\theta}\,Q' - \dfrac{\partial Q'}{\partial\theta}\,C'}{(Q')^2}\right)}_{?} \tag{14}$$

and observe that the first term in parentheses is positive by assumption, and that the second term in parentheses (whose sign is unknown) also appears in (11). Thus if $\partial^2\tau^*/\partial\theta\,\partial E > 0$, this implies that the second term in parentheses in (14) is positive, so that the entire second term in (11) is positive. In other words, if the political business cycle attenuates as salience declines (i.e., $\partial^2\tau^*/\partial\theta\,\partial E > 0$, for which I find evidence in the empirical work below), this makes it more likely that a decline in tax salience raises taxes (i.e., $\partial\tau^*/\partial\theta > 0$).

To investigate the relationship between tax salience and tax rates empirically, I note that the first-order condition for the tax rate in (10) indicates that the tax rate will depend on tax salience ($\theta$), whether it is an election year (i.e., $E = 1$ or $E = 0$), and the interaction of these two effects. Because of the serial correlation properties of taxes in my empirical application (which I discuss in more detail below), I estimate the relationship between taxes and salience in first differences, estimating

$$\Delta\tau = \beta_1\,\Delta\theta + \beta_2\,E + \beta_3\,E\,(\Delta\theta) + \mu. \tag{15}$$
Estimation of (15) allows a comparison of the effect of tax salience on tax rates in nonelection years (i.e., $\beta_1$) and in election years (i.e., $\beta_1 + \beta_3$, so that $\beta_3$ measures the difference between the two).

II.C. Identification

An examination of the two main estimating equations (equation (8), which comes from the driver optimization problem and reveals how behavioral responsiveness to the tax changes with salience, and equation (15), which comes from the political optimization problem and reveals how the tax varies with salience) highlights two important identification problems.

First, taxes are taken as exogenous to demand in the demand estimation equation (8), but are determined as the endogenous result of the political optimization problem (see (10)). Identification of the demand equation requires that the error term $\varepsilon$ in the demand equation (8) be uncorrelated with the error term $\mu$ in the tax-setting equation (15); in other words, identification requires that changes in demand do not contemporaneously affect changes in taxes. For example, if demand follows a random walk, then as long as the government tax-setting process takes at least one year to respond to demand, current changes in taxes will be uncorrelated with current changes in demand and the demand equation (8) will be identified.3 This identifying assumption seems reasonable for a (bureaucratic) government that may not be able to make and implement decisions quickly. In the empirical application, I will show that, in practice, taxes are changed only about once a decade, which is consistent with the assumption of a lagged response.
Furthermore, any changes in taxes that are driven by changes in any of the nondemand factors that (10) indicates affect tax rates, namely the sensitivity of political costs to the tax rate ($C'$), the electoral calendar ($E$), or the relative weight ($\lambda$) that the government places on the political costs of taxes, do not pose a problem for identification (as long as changes in these factors are themselves exogenous to changes in current demand).

The second identification problem is that I allow the tax ($\tau$) to be chosen endogenously by the political optimization problem in (9), but assume that the salience of the tax system ($\theta$) is exogenously determined. If the government endogenously chooses $\theta$ (e.g., on the basis of any of the factors that determine $\tau$), the tax-setting estimating equation (15) is not identified. The validity of the assumption that the choice of tax salience is exogenous with respect to the choice of tax rate is ultimately an empirical question, and one that I explore in depth in Section VI.A.

3. In my empirical application I find that changes in (residual) demand have an AR(1) coefficient of 0.045, suggesting that demand is (close to) a random walk. I also explore robustness of demand estimation to alternative specifications with weaker identifying assumptions (see Section V).

II.D. Other Government Objective Functions and Normative Implications

For concreteness, in Section II.B I assumed the government’s objective function in choosing the tax rate was a weighted average of the revenue raised by the tax (its economic objective) and the (negative of) the political costs of the tax (its political objective). Of course, the government may well have other economic objectives, such as redistributive taxes or Pigouvian corrective taxes; the latter is potentially quite relevant for the toll case that is the subject of the empirical work. As with a revenue-raising tax, the optimal level of these other types of taxes also varies inversely with the behavioral responsiveness to the tax. For example, if the tax is set as an optimal Pigouvian externality correction, the optimal tax will be increasing as the behavioral responsiveness to the tax declines. Therefore the same empirical prediction concerning how the impact of salience on the behavioral responsiveness to the tax likely affects the impact of tax salience on tax rates should apply (qualitatively) to these other economic objectives. In contrast to the positive empirical predictions, the normative implications of any effect of tax salience on tax rates will be quite sensitive to the government’s objective function.
One critical issue for the normative implications of tax salience is whether the government operates as a benign social planner or is (partially or fully) maximizing independent objectives (such as keeping politicians in office or increasing the size of government); in the latter case, the government’s response to a decline in salience may be self-serving, but not socially optimal. The evidence I present below that the political business cycle in toll setting attenuates when salience is reduced suggests that part of the impact of tax salience on tax rates comes from reducing the political costs of raising tolls; this suggests that the government’s response to a reduction in tax salience may not be that of a fully benign social planner. Even when the government operates as a fully benign social planner, the normative implications of a decline in salience will also depend on the economic component of the government’s
objective function. If the economic objective is to raise revenue, then if a decline in salience reduces the behavioral responsiveness to the tax, this is likely to be welfare-improving because it allows the government to raise a given amount of revenue at lower distortionary costs. However, if the economic objective of the tax is a Pigouvian externality correction, the normative implications may be quite different. For example, if a decline in salience reduces the behavioral responsiveness to the tax, this has no effect on welfare if the tax is set solely as a Pigouvian corrective tax, utility is quasi-linear in the taxed good, and the revenue raised is rebated back to consumers as a lump sum; the government would raise the tax to the (new) higher optimal externality-correction tax and rebate back the resulting (higher) revenue as a lump sum, with no change in aggregate welfare. However, in more general models in which utility is not quasi-linear and/or the government does not rebate back the revenue raised as a lump sum, a lower behavioral responsiveness to the Pigouvian tax due to reduced salience can be welfare-reducing.

III. IMPACT OF ETC ON TOLL SALIENCE: SURVEY EVIDENCE

The empirical analysis is predicated on the assumption that ETC reduces the salience of the tolls (i.e., increases $\theta$). I therefore begin by presenting survey evidence consistent with this assumption. Evidence from two separate surveys indicates that individuals are substantially less aware of tolls if they pay them electronically rather than with cash. One survey is an in-person survey that I designed and conducted in May 2007 of 214 individuals who had driven to an antiques show in western Massachusetts on the Massachusetts Turnpike (“MA Survey”). The other is a telephone survey conducted in June and July 2004 of 362 regular users from New Jersey of any of the six bridges or tunnels of the Port Authority of New York and New Jersey that cross the Hudson River (“NYNJ Survey”).
More details on the MA Survey can be found in the Online Appendix (Section A); more details on the NYNJ Survey can be found in Holguin-Veras, Kaan, and de Cerrano (2005, especially pp. 116–126 and pp. 383–394). Each survey asked drivers their estimate of the toll paid on their most recent trip on the relevant facility, their method of payment, and a variety of demographic characteristics; information about the exact trip was also collected so that the actual toll paid could be calculated.
Table I summarizes the results. Both surveys show a strikingly lower awareness of tolls among drivers who paid with ETC than among those who paid with cash. The differences are both economically and statistically significant. In the MA Survey, 62% of drivers who paid using ETC responded to the question about their best guess of the toll they paid that day on the Turnpike with “I don’t know” and would not offer a guess without prompting from the surveyor to please “just make your best guess”;4 in contrast, only 2% of drivers who paid with cash had to be prompted to offer a guess. In the NYNJ Survey, 38.1% of ETC users reported “do not know” or “refused” when asked how much they paid at the toll in their most recent drive across the Hudson from New Jersey to New York, compared to 20.0% of cash users.5

Moreover, the ETC drivers’ belief that they did not know how much they had paid for the toll was borne out by their subsequent guesses. In the MA Survey, 85% of drivers who paid using ETC estimated the toll they paid incorrectly, compared to only 31% of drivers who paid using cash. In the NYNJ Survey, 83% of ETC drivers estimated the toll incorrectly, compared to only 40% of cash drivers. Conditional on making an error, the magnitude of the error was also larger for ETC users; ETC users overestimate tolls by more than cash users.6

These findings of markedly lower knowledge of tolls among people who paid electronically than among those who paid with cash are consistent with the maintained assumption that tolls are less salient under ETC. In other words, the results are consistent with ETC reducing the link between the actual and the perceived toll (i.e., $\delta'_{1j}(\theta) < 0$). These findings are also consistent with other work on “payment decoupling,” which finds that technologies such as credit cards, which decouple the purchase from the payment, reduce awareness of the amount spent and thereby encourage more spending (e.g., Thaler [1999]; Soman [2001]).

4. Indeed, many of the ETC drivers literally responded, “I don’t know, I used EZ-Pass [or Fast Lane].”

5. It is interesting that the discrepancy in toll awareness between ETC and cash drivers is larger in the MA Survey. One possible explanation is that the NYNJ Survey asked about the toll paid on a regular commute, whereas the MA Survey asked about the toll paid on a presumably idiosyncratic trip. Differences in the survey method (e.g., telephone vs. in person) may also have an effect on the individual’s willingness to guess.

6. This finding that ETC is associated with overestimation of the toll is consistent with the finding in Section V that ETC is also associated with reduced behavioral responsiveness to the toll. See equation (6) in Section II.A and the discussion that follows it.
TABLE I
SURVEY EVIDENCE ON DRIVER AWARENESS OF TOLLS, BY PAYMENT METHOD

MA survey
                                            ETC drivers     Cash drivers    Difference between ETC and cash drivers
                                                                            No covariates   Covariate adjusted
                                            (1)             (2)             (3)             (4)
Fraction who report “don’t know”            0.618           0.021           0.597***        0.579***
                                            (0.490)         (0.142)         (0.060)         (0.060)
Fraction who incorrectly estimate toll      0.851           0.308           0.543***        0.512***
                                            (0.359)         (0.463)         (0.058)         (0.067)
Mean error, conditional on misreporting     $1.334          $0.162          $1.172***       $1.01***
                                            (1.850)         (0.828)         (0.275)         (0.303)
N                                           68              146

NYNJ survey
                                            ETC drivers     Cash drivers    Difference between ETC and cash drivers
                                                                            (no covariates)
                                            (5)             (6)             (7)
Fraction who report “don’t know”            0.381           0.200           0.18***
                                            (0.486)         (0.400)         (0.05)
Fraction who incorrectly estimate toll      0.826           0.395           0.43***
                                            (0.379)         (0.489)         (0.06)
Mean error, conditional on misreporting     $0.40           −$0.10          $0.50
N                                           271             91

Notes. In columns (1), (2), (5), and (6), standard deviations are in parentheses; in columns (3), (4), and (7) robust standard errors are in parentheses and ***, **, * denote statistical significance at the 1%, 5%, and 10% levels, respectively. “Error” in the third row is computed as estimated toll − actual toll paid. In the MA Survey an estimate of the toll paid was eventually elicited from all but one of the respondents; however, in the NYNJ Survey, an estimate of the toll paid was only elicited for those who did not respond “don’t know” or “refused.” Thus, for the MA Survey, the sample in rows (2) and (3) includes all but one of the respondents in row (1), but for the NYNJ Survey, the sample in rows (2) and (3) includes only those respondents who did not report “don’t know” in row (1). For the NYNJ Survey, the cash toll was $6.00, whereas the ETC toll was $5.00 on peak and $4.00 off peak. For the MA Survey, the toll depended on the entrance and exit taken. The average toll paid was about $1.15. Less than 10% of drivers in the MA Survey sample drove on a portion of the Turnpike in which there are ETC discounts, and the results are not affected by omitting these drivers from the analysis. In column (4), covariates consist of age, age squared, median household income of ZIP code, dealer retail price for the driver’s car (based on information from www.edmunds.com as of October 2007), and indicator variables for sex, whether the driver regularly pays a toll on a commute to work, and highest level of education reached (high school degree or less, college degree, or postcollege degree, where “college degree” includes associate’s degrees, which were 10% of the college degree sample). Only published summary statistics (as opposed to the underlying microdata) are available for the NYNJ Survey, so that the covariate-adjusted difference in means cannot be computed. In addition, the sample sizes by cell for the NYNJ Survey had to be approximated based on information in the text on the total sample size (362) and the fraction of drivers that pay by ETC (74.8%). As a result, the standard errors for the NYNJ Survey are also approximated; approximated numbers are shown in italics. I calculated standard deviations for the binary response variables in the NYNJ Survey, but there was not sufficient information available to calculate the standard deviation for the mean error (or the standard error of the difference in mean error).
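The approximations described in the notes for the NYNJ Survey can be reproduced from the published summary statistics. The sketch below is one plausible way to do so: it assumes that cell sizes are obtained by rounding the ETC share of the total sample, that the standard deviation of a binary variable is $\sqrt{p(1-p)}$, and that the standard error of the difference uses the unpooled two-sample formula. These choices and the function names are assumptions for illustration, not a description of the exact calculation behind the table.

```python
import math


def approx_cell_sizes(total, etc_share):
    """Approximate ETC and cash cell sizes from the total sample and the ETC share."""
    n_etc = round(total * etc_share)
    return n_etc, total - n_etc


def binary_sd(p):
    """Standard deviation of a 0/1 variable with mean p."""
    return math.sqrt(p * (1.0 - p))


def diff_in_proportions(p_a, n_a, p_b, n_b):
    """Difference in two proportions and its unpooled standard error."""
    diff = p_a - p_b
    se = math.sqrt(p_a * (1.0 - p_a) / n_a + p_b * (1.0 - p_b) / n_b)
    return diff, se


# Inputs taken from Table I and its notes (NYNJ Survey, "don't know" row).
n_etc, n_cash = approx_cell_sizes(362, 0.748)
diff, se = diff_in_proportions(0.381, n_etc, 0.200, n_cash)
print(n_etc, n_cash, round(binary_sd(0.381), 3), round(diff, 2), round(se, 2))
```

Under these assumptions the sketch gives cell sizes of 271 and 91, a standard deviation of 0.486 for the ETC “don’t know” share, and a difference of 0.18 with a standard error of 0.05, in line with columns (5) through (7).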
Several caveats are in order. First, neither survey is representative of the nationwide population. Nonetheless, it is reassuring that the finding of lower toll awareness among ETC drivers persists in two very different populations, including a population of regular commuters. Second, cross-sectional differences in awareness of tolls between ETC drivers and cash drivers could reflect differences in these drivers besides their payment method. Reassuringly, a comparison of the results in columns (3) and (4) of Table I shows that none of the differences in toll awareness in the MA Survey are sensitive (in either magnitude or statistical significance) to adding controls for demographic characteristics of drivers, including age, sex, education, median household income of ZIP code, and value of their car. Finally, a survey response on toll perception does not necessarily reflect either the perceived toll at the time of consumption (τ˜C ) or the perceived toll at the time of voting (τ˜V ). However, given the large percentage of cash drivers relative to ETC drivers who are spot on in estimating the toll paid correctly, it seems plausible that ETC may reduce one or both of these types of salience. I now turn to direct evidence of the impact of ETC first on consumer behavior and then on toll setting.
IV. DATA AND DESCRIPTIVE STATISTICS

This section provides some brief background on the sample construction and variable definitions for the toll facility data; considerably more details on the facilities in the sample and the variable definitions can be found in the Online Appendix (Section B) or in the working paper version of this paper (Finkelstein 2007).

IV.A. Sample Construction

The target sample was all 183 publicly owned toll facilities in the United States (excluding ferries) that were charging tolls in 1985, which predates the introduction of ETC in the United States. In 1985, toll revenue in states that levied tolls was about 0.8% of state and local tax revenue, roughly the same revenue share as state lotteries (U.S. Census Bureau 1985; U.S. Department of Transportation 1985, 1986; Kearney 2005). Statutory authority for toll setting is usually vested in toll operating authorities. These are typically appointed by state or local governments, which therefore, in practice, retain influence on toll setting.
FIGURE I
Distribution of ETC Start Dates
[Histogram. Horizontal axis: ETC start date (1985 to 2005). Vertical axis: frequency (0 to 15).]
By contacting each toll authority, I was able to collect data for 123 toll facilities.7 These 123 facilities are run by 49 different operating authorities in 24 different statelike entities; these include 22 states and 2 joint ventures (one between New York and New Jersey and one between New Jersey and Pensylvania). I refer to all 24 hereafter as “states.” On average, the data contain 50 years of toll rates per facility. IV.B. Key Variables ETC Adoption and Diffusion. Figure I shows a histogram of ETC adoption dates, which range from 1987 through 2005, with a median of 1999. By 2005, 87 of the 123 facilities had adopted ETC. Almost all of the variation in whether and when ETC is adopted is between rather than within operating authorities; there is, however, substantial variation across authorities within a state (not shown). On average for a facility with ETC, I observe about six years of ETC. Table II shows that relationship between facility characteristics and ETC adoption. ETC adoption rates are highest in the northeast (78%) and lowest in the west (57%). The high adoption rates in the northeast may reflect greater urbanism (because ETC 7. A toll “facility” is a particular road, bridge, or tunnel; about 60 percent of the responding facilities are bridges or tunnels.
E-ZTAX: TAX SALIENCE AND TAX RATES
TABLE II
WHICH FACILITIES ADOPT ETC?

                          Number of     Probability of          Average adoption date
                          facilities    adopting ETC by 2005    conditional on adoption
All                       123           .71                     1998.2
By facility type
  Roads                   44            .70                     1996.4
  Bridges or tunnels      79            .71                     1999.2
By region of country
  Northeast               58            .78                     1998.7
  Midwest                 10            .60                     1996.7
  South                   41            .68                     1997
  West                    14            .57                     2000.9
may help reduce congestion) as well as higher labor costs (because ETC reduces labor costs of toll collection). ETC is adopted with the same probability on roads as on bridges and tunnels; however, roads that adopt ETC do so about three years earlier on average than bridges or tunnels that adopt ETC. Older facilities are more likely to adopt ETC, and those that do are likely to do so earlier than younger facilities that adopt ETC (not shown). Once a facility adopts ETC, use of the technology diffuses gradually across drivers. I was able to obtain the ETC penetration rate (defined consistently within each facility as either the fraction of toll transactions or the fraction of toll revenue collected by ETC) for about two-thirds of facility-years with ETC. Figure II shows the within-facility ETC diffusion rate. It takes about fourteen years for ETC to reach its steady state penetration rate of 60 percent. Toll Histories. I define the toll as the nominal toll for passenger cars on a full-length trip on a road, or on a round trip on a bridge or tunnel. I collected data on both the “manual” (i.e., cash) toll and any discount offered for the electronic toll; the electronic toll is never more than the cash toll.8 Over half (53 of 87) of facilities with ETC offer a discount at some point. Discounts are presumably offered to encourage use of the technology; indeed, they are more common on facilities that adopt ETC earlier. The discounts may also be rationalized as a Pigouvian subsidy if ETC has positive externalities on congestion reduction. The average discount offered is about 15 percent. 8. High-frequency discounts (i.e., commuter discounts) are not coded. None of the facilities in the sample offer time-of-day varying prices.
FIGURE II
Within-Facility ETC Diffusion
[Line plot. Horizontal axis: ETC year (1 to 19); vertical axis: ETC penetration rate (0 to 0.7).]
Figure II reports the coefficients on indicator variables for the number of years a facility has had ETC from the following regression:
\[ \text{ETC Penetration}_{it} = \alpha_i + \sum_{k=1}^{19} \beta_k \, 1(\text{ETCyear} = k), \]
where the $\alpha_i$ are facility fixed effects, $1(\text{ETCyear} = k)$ are indicator variables for whether it is the $k$th year of ETC, and ETC Penetration is defined either as percentage of toll transactions paid by ETC or as percentage of revenue paid by ETC, depending on the facility. The regression is estimated on the sample of facility-years with ETC and data on ETC penetration (N = 467; 84 unique facilities).
The primary toll measure in the analysis is the lower envelope of the manual and electronic tolls (hereafter, “minimum toll”). I also present results for the subsample of facilities that never offer ETC discounts, and for which the minimum and manual toll are therefore always the same. On average, the minimum toll increased by 2.0% per year. This is substantially below the facility-year-weighted average inflation rate of 4.2%. Toll changes are lumpy; on average only 7.7% of facilities increase their minimum toll and only 1% of facilities decrease it each year.
Revenue and Traffic Data. I was able to collect traffic (revenue) data for 76 (45) of the 123 facilities. On average, for a facility with these data, I obtained 34 years of data.
V. THE IMPACT OF ETC ON THE ELASTICITY OF DRIVING WITH RESPECT TO THE TOLL CHANGE
To examine how ETC affects the elasticity of driving with respect to the tax, I adapt the demand equation (8) to the toll
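context as follows:
\[
\begin{aligned}
\Delta\log(\text{traffic})_{it} = {} & \gamma_t + \beta_1 \Delta\log(\text{minimum toll}_{it}) + \beta_2 \Delta\log(\text{minimum toll}_{it}) * \text{Never ETC}_i \\
& + \beta_3 \Delta\log(\text{minimum toll}_{it}) * \text{ETC Penetration}_{it} + \beta_4 \text{Never ETC}_i + \beta_5 \text{ETC Penetration}_{it} + \varepsilon_{it}.
\end{aligned}
\tag{16}
\]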
I proxy for demand for the taxed good (i.e., $x_1$ in (8)) with the amount of traffic on facility $i$ in year $t$ (i.e., $\text{traffic}_{it}$), and for the salience of the tax system (i.e., $\theta$ in (8)) with the ETC penetration rate on facility $i$ in year $t$ (i.e., $\text{ETC Penetration}_{it}$). For purposes of practicality, I estimate the demand responsiveness to $\tau$ in (16) rather than to $p + \tau$ as in (8), because I do not observe the nontax costs ($p$) of driving. As long as $p$ does not vary with taxes or with tax salience (i.e., the fixed producer prices assumption discussed in Section II), this modification will affect the magnitude of the estimated elasticities but not their sign. As noted, I use the minimum toll as my measure of $\tau$.
Equation (16) examines the relationship between the annual percentage change in a facility’s traffic ($\Delta\log(\text{traffic})_{it}$) and the annual percentage change in its toll ($\Delta\log(\text{minimum toll})_{it}$) and how this relationship changes with the ETC penetration rate. To strengthen the inference, it also allows the elasticity to vary across facilities based on whether the facility ever adopted ETC ($\text{Never ETC}_i$ is 1 if the facility never adopts ETC and zero otherwise), and it allows for secular changes in demand over time (the $\gamma_t$ represent a full set of year fixed effects). The key coefficient of interest is $\beta_3$; this indicates how the elasticity changes at a facility as ETC use diffuses. Finally, $\varepsilon_{it}$ is a random disturbance term capturing all omitted influences. I allow for an arbitrary variance–covariance matrix within each “state” and give equal weight in the regression to each operating authority.
As discussed in Section II.C, identification of (16) is based on the assumption that changes in tolls are not affected by contemporary changes in demand. This is probably a reasonable assumption. Traffic, and presumably underlying demand for driving, changes continuously each year, whereas a facility’s toll is raised on average only every eight to nine years.
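The first-differenced variables in (16) can be illustrated with a short sketch. It assumes a hypothetical toll history for a single facility (the values are made up for illustration and are not taken from the data) and computes year-over-year changes in logs, which approximate annual percentage changes. The function name is likewise illustrative.

```python
import math

def log_changes(levels):
    """Return year-over-year changes in logs for a list of positive annual levels."""
    return [math.log(curr) - math.log(prev) for prev, curr in zip(levels, levels[1:])]

# Hypothetical toll history for one facility (illustrative values only).
tolls = [1.00, 1.00, 1.00, 1.10, 1.10]
d_log_toll = log_changes(tolls)
# Because tolls change rarely, most entries are zero; the single increase
# from 1.00 to 1.10 appears as log(1.10), roughly a 9.5% change.
```

Because the toll series is lumpy while traffic moves every year, most of the variation in the toll regressor comes from the few years in which a toll actually changes.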
The infrequency of toll adjustment likely reflects both general lags in price setting by government enterprises and political constraints; for example, I show in Section VI.B that toll increases are significantly lower during state election years. Although tolls may be
adjusted in part based on past demand shocks (i.e., lagged values of changes in traffic), changes in traffic within a facility show very little serial correlation; a regression of the residuals from (16) on their lags produces a coefficient of only 0.045. Any adjustment of tolls to past changes in demand is therefore unlikely to pose much of a practical problem for the estimation. However, as a robustness check, I also report results in which I limit the sample to the years in which a toll changes or the two years before or after a toll change; I refer to this as the “+2/−2 sample.” The assumption in this more limited sample is that the timing of the toll change is random with respect to short-run traffic changes, although it may reflect longer-run demand changes.
I estimate (16) on approximately one-fourth of the facilities in the data. By necessity, the analysis is limited to the approximately 60 percent of facilities for which I obtained traffic data. I further limit the subsample of facilities with traffic data to the approximately 40 percent of them that never offer an ETC discount. This allows me to include the ETC penetration rate directly on the right-hand side, without worrying about omitted variable bias from any potential effect of an ETC discount on both the ETC penetration rate and traffic. An added advantage of looking only at facilities that never offer an ETC discount is that in this sample there is only one toll rate (i.e., the minimum toll and the manual toll are always the same), which avoids the measurement error that ETC discounts would otherwise introduce in the right-hand-side toll variable once ETC is introduced.9
Table III reports the results. Columns (1) and (2) show the results from regressing $\Delta\log(\text{traffic})_{it}$ on $\Delta\log(\text{minimum toll})_{it}$ and year fixed effects. Column (1) shows the results for the full sample of facilities with traffic data, including those that offer ETC discounts.
The coefficient on $\Delta\log(\text{minimum toll})_{it}$ of −0.049 (standard error 0.015) indicates that a 10% increase in tolls is associated with a statistically significant but economically small 0.5% reduction in traffic. Column (2) shows that the result is quite similar for the sample of facilities that never offer ETC discounts; the coefficient on $\Delta\log(\text{minimum toll})_{it}$ is −0.058 (standard error
9. I show below that the estimated impact of ETC on toll rates is robust to limiting the sample to facilities that never offer discounts. When I limit to those for whom I have traffic data, the effect is very similar in magnitude to the estimates in the full sample, although no longer statistically significant at conventional levels (not shown).
TABLE III
THE ELASTICITY OF TRAFFIC WITH RESPECT TO TOLLS

                                        (1)          (2)          (3)          (4)          (5)          (6)
Δ log min. toll_it                      −0.049       −0.058       −0.061       −0.057       −0.062       −0.060
                                        (0.015)      (0.018)      (0.019)      (0.017)      (0.039)      (0.037)
                                        [.004]       [.008]       [.009]       [.006]       [.145]       [.135]
Δ log min. toll_it * ETC penetration_it                           0.134                     0.141
                                                                  (0.038)                   (0.076)
                                                                  [.005]                    [.091]
Δ log min. toll_it * ETC year_it                                               0.006                     0.006
                                                                               (0.001)                   (0.003)
                                                                               [.002]                    [.062]
Δ log min. toll_it * never ETC_i                                  −0.071       −0.073       −0.009       −0.006
                                                                  (0.136)      (0.131)      (0.209)      (0.205)
                                                                  [.611]       [.588]       [.966]       [.976]
Mean dep. var.                          0.049        0.042        0.043        0.042        0.040        0.039
# of states                             21           12           12           12           12           12
# op. authorities                       32           16           16           16           16           16
# of facilities                         76           33           33           33           33           33
N                                       2,200        727          671          727          292          305
Sample restriction(s)                                No ETC       No ETC       No ETC       No ETC       No ETC
                                                     discounts    discounts    discounts    discounts    discounts
                                                                                            +2/−2 sample +2/−2 sample
Notes. Table reports results from estimating variants of (16) by OLS. The dependent variable is the change in log traffic. In addition to the covariates reported in the table, all regressions include year fixed effects and a main effect for any variables that are interacted with log(min. toll). The bottom row indicates any sample restrictions. “No ETC discounts” limits facilities to those that never offered an ETC discount. “+2/−2 sample” limits sample to facility-years in which there is a toll change or the two years before or after a facility’s toll change. Never ETCi is an indicator variable for whether facility i never has ETC. ETC penetrationit is the share of tolls paid by ETC on facility i in year t; it is zero in years in which the facility did not have ETC. ETC yearit is the number of years the facility has had ETC; it is zero in any year in which the facility does not have ETC, 1 the year the facility adopts ETC, 2 the second year the facility has ETC, and so forth. Each operating authority receives equal weight. Standard errors (in parentheses) are clustered by state. p-values are reported in square brackets.
= 0.018). These results suggest that tolls are set below the profit-maximizing rate, which is consistent with Peltzman’s (1971) observation that there will be a downward bias in the prices set by government-owned enterprises. More generally, it suggests that, as modeled in Section II.B, the government objective function is not pure revenue maximization.10
10. Of course, I am only measuring the short-run response to a small change in tolls; this behavioral response may merely reflect the route chosen on a particular day. Longer-run responses to (possibly larger) toll changes may be larger, reflecting among other things decisions that affect regular commuting patterns.
Column (3) shows the results from estimating the complete equation (16). The coefficient on $\Delta\log(\text{minimum toll}_{it}) * \text{ETC penetration}_{it}$ is 0.134 (standard error 0.038); this indicates that a 5-percentage-point increase in the ETC penetration rate (which is the average increase per year of ETC) is associated with a (statistically significant) 0.0067 decline in the absolute value of the elasticity of driving with respect to the toll, or about 10 percent relative to the average estimated elasticity prior to ETC of −0.061. Column (4) shows the results when the ETC Penetration variable in (16) is replaced by the number of years the facility has had ETC (ETC Year); this variable is zero prior to ETC adoption, 1 in the year of adoption, 2 in the second year of ETC, and so forth. The coefficient on $\Delta\log(\text{minimum toll}_{it}) * \text{ETC Year}_{it}$ is 0.006 (standard error 0.001), indicating a decline in elasticity of 0.006 per year of ETC, quite similar to that estimated in column (3).11
The last two columns of Table III repeat the analysis in columns (3) and (4) on the +2/−2 sample. The point estimates on both the elasticity of driving under manual toll collection and the change in the elasticity associated with ETC Year (or ETC Penetration) remain virtually unchanged. The change in the elasticity associated with ETC remains statistically significant, although at the 10% level in the +2/−2 sample (columns (5) and (6)) rather than at the 1% level as in the larger samples (columns (3) and (4)).
As noted in Section II.B, for taxes that are small as a portion of income, if a decline in salience reduces the behavioral responsiveness to the toll, this will tend to cause tolls to rise when salience declines. However, the net impact of salience on toll rates is ambiguous; it also depends on how salience affects the political costs of toll setting. I now turn to an examination first of the net effect of ETC on toll rates and then of the effect of ETC on the political costs of tolls.
11. One potential concern in interpreting these results is that the finding of a decline in the absolute value of the elasticity of driving with respect to the toll under ETC might spuriously reflect a general time trend in the elasticity of driving with respect to the toll. To investigate this, I reestimated the regressions shown in columns (3) and (4) of Table III with the inclusion of an additional interaction term $\Delta\log(\text{minimum toll})_{it} * \text{year}_t$ on the right-hand side; this allows for a time trend in the elasticity of driving. The inclusion of this interaction term weakened the precision of the estimated decline (in absolute value) of the driving elasticity under ETC, but did not substantively affect the finding. For example, for the specification shown in column (3), the coefficient on $\Delta\log(\text{minimum toll}_{it}) * \text{ETC penetration}_{it}$ became 0.137 (standard error 0.067). In column (4), the coefficient on $\Delta\log(\text{minimum toll}_{it}) * \text{ETC Year}_{it}$ became 0.005 (standard error 0.002).
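As a check on the magnitudes, the arithmetic behind the interpretation of column (3) can be reproduced from the reported point estimates. The sketch below only restates numbers from Table III; the function and constant names are illustrative and are not part of the original analysis.

```python
# Point estimates from Table III, column (3).
BETA_TOLL = -0.061                # elasticity under manual collection (penetration = 0)
BETA_TOLL_X_PENETRATION = 0.134   # interaction with the ETC penetration rate

def implied_elasticity(penetration):
    """Elasticity of traffic with respect to the toll at a given ETC
    penetration rate (between 0 and 1), for a facility that adopts ETC."""
    return BETA_TOLL + BETA_TOLL_X_PENETRATION * penetration

# A 5-percentage-point rise in penetration (the average annual increase).
change = implied_elasticity(0.05) - implied_elasticity(0.0)
# Size of that change relative to the pre-ETC elasticity.
relative = change / abs(BETA_TOLL)
```

Here `change` comes to roughly 0.0067 and `relative` to roughly 0.11, in line with the "about 10 percent" stated in the text.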
VI. THE IMPACT OF ETC ON POLITICAL BEHAVIOR
VI.A. The Impact of ETC on Toll Rates
Baseline Specification. To estimate the impact of ETC on toll rates, I begin with a simplified version of the estimating equation for tax setting (equation (15)) in which I omit any measure of whether it is an election year from the right-hand side. Because the election calendar is set exogenously, this does not introduce any omitted variable bias, and allows me to capture the average impact of ETC on toll rates; I augment the analysis to include electoral effects in Section VI.B. I therefore begin with the estimating equation:
\[ y_{it} = \gamma_t + \beta_1 \text{ETCAdopt}_{it} + \beta_2 \text{ETC}_{it} + \mu_{it}. \tag{17} \]
In the baseline specification, the dependent variable is the change in the log of the minimum toll ($\Delta\log(\text{min toll})_{it}$). I estimate the dependent variable in logs rather than in levels (as in equation (15) in Section II.B) in order not to constrain toll rates in different facilities to grow by the same absolute amount each year; this seems undesirable, given the considerable variation in toll rates across facilities.12 The $\gamma_t$s represent year dummies that control for any common secular changes in toll rates across facilities. The key coefficients of interest are those on $\text{ETCAdopt}_{it}$ and $\text{ETC}_{it}$, which represent my parameterization of the change in tax salience ($\theta$ in (15)). Specifically, $\text{ETCAdopt}_{it}$ is an indicator variable for whether facility $i$ adopted ETC in year $t$. The coefficient on $\text{ETCAdopt}_{it}$ thus measures any level shift in the minimum toll associated with the introduction of ETC; this might include, for example, the effect of any ETC discounts. However, because ETC use among drivers diffuses gradually, it is likely that any impact of ETC on toll rates will also phase in gradually. To capture this, I include the indicator variable $\text{ETC}_{it}$ for whether facility $i$ has ETC in year $t$; it is 1 in the year of ETC adoption and in all subsequent
12. In practice, the sign and statistical significance of the impact of ETC on tolls are robust to specifying the dependent variable as the change in the level of the minimum toll rather than the change in the log of the minimum toll; the magnitude of the effect is slightly more than double in this alternative specification (not shown). One potential concern with the log specification is that the dependent variable is censored when a toll is set to 0. Indeed, 15 of the 123 facilities that were charging a toll in 1985 subsequently set the toll to zero. I treat all facility-years with zero tolls as censored (both in the log and in the level analysis). This likely biases downward any estimated impact of ETC, because I find that ETC is associated with a marginally statistically significant decline in the probability that the toll rate is changed from nonzero to zero (not shown).
years. The coefficient on ETCit thus measures the average annual growth in a facility’s toll once it has ETC. Thus I parameterize θ with ETCAdoptit and ETCit in the first year of ETC, and I parameterize θ with ETCit in all subsequent years with ETC. Finally, μit is a random disturbance term capturing all omitted influences.13 I estimate (17), allowing for an arbitrary variance–covariance matrix within each state, and give equal weight in the regression to each operating authority. The first column of Table IV shows the results from estimating (17). The coefficient on ETCit is 0.015 (standard error 0.006). This indicates that once a facility has ETC, its toll increases by 1.5 percentage points more per year than it otherwise would have. This effect is both statistically and economically significant. Relative to the average annual 2% increase in tolls, it implies that after installation of ETC, the facility’s toll rate rises by 75% more per year than it did prior to ETC.14 The toll change in the first year of ETC is given by the sum of the coefficients on ETCAdoptit and ETCit . These indicate that there is a (statistically insignificant) 3.6% decline in tolls the year that ETC is adopted. The results in the next two columns suggest that this decline in the year of ETC adoption is due to ETC discounts. Column (2) shows the results when the dependent variable is the change in the log manual toll; column (3) shows the results when the sample is limited to the approximately 60 percent of facilities that never offered an ETC discount (half of which never adopted ETC), for which the manual and minimum toll are always the same. In these alternative specifications, the sum of the coefficients on ETCAdoptit and ETCit is either positive and insignificant (column (2)) or negative and now both economically and statistically insignificant (column (3)). 
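The two indicators in (17) and the first-year arithmetic described above can be sketched as follows. The helper names and the example adoption year are hypothetical; the coefficients are the column (1) point estimates from Table IV and the 2% figure is the average annual toll increase reported in the text.

```python
def etc_indicators(year, adoption_year):
    """Return (ETCAdopt, ETC) for a facility-year.

    adoption_year is None for a facility that never adopts ETC.
    """
    if adoption_year is None or year < adoption_year:
        return 0, 0
    return (1 if year == adoption_year else 0), 1

# Table IV, column (1) point estimates and the average annual toll growth.
BETA_ADOPT, BETA_ETC = -0.051, 0.015
MEAN_ANNUAL_GROWTH = 0.020

def etc_effect(year, adoption_year):
    """Implied change in annual toll growth (in log points) due to ETC."""
    adopt, etc = etc_indicators(year, adoption_year)
    return BETA_ADOPT * adopt + BETA_ETC * etc

first_year = etc_effect(1999, 1999)        # adoption year: both terms apply
later_year = etc_effect(2003, 1999)        # later years: only the ETC term
relative = later_year / MEAN_ANNUAL_GROWTH # growth relative to the 2% average
```

With these inputs, `first_year` is about −0.036 (the 3.6% first-year decline) and `relative` is about 0.75 (the 75% faster annual growth).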
The fact that the growth in tolls under ETC persists in the “no discount” sample (column (3)), where the coefficient on $\text{ETC}_{it}$ is statistically significant and slightly larger in magnitude than in the full sample in column (1), indicates that the estimated growth
13. I estimate (17) in first differences rather than in levels with facility fixed effects because the residuals are much less highly serially correlated in first differences (AR1 coefficient of −0.045) than in the fixed effects version (AR1 coefficient of 0.92), making the first-differenced specification the preferred specification (Wooldridge 2002, pp. 274–281).
14. One might prefer to specify the percentage increase in the toll associated with ETC relative to the average annual growth rate of tolls prior to ETC; this is 1.9%. It is quite similar to the sample average (despite an average annual growth rate of tolls under ETC of 2.8%) because the vast majority of facility-years in the approximately fifty-year toll histories I collected on each facility do not have ETC.
TABLE IV
IMPACT OF ETC ON TOLL RATES

Dependent variable    Δ log        Δ log        Δ log toll   Δ log toll   Δ log        Δ log
                      min. toll    manual toll                            min. toll    min. toll
                      (1)          (2)          (3)          (4)          (5)          (6)
ETC_it                0.015        0.020        0.024
                      (0.006)      (0.006)      (0.012)
                      [.018]       [.004]       [.061]
Δ ETC penetration_it                                         0.623        0.557        0.501
                                                             (0.285)      (0.262)      (0.261)
                                                             [.044]       [.045]       [.067]
ETCAdopt_it           −0.051       0.016        −0.033       −0.051       −0.105       −0.097
                      (0.035)      (0.032)      (0.019)      (0.035)      (0.109)      (0.108)
                      [.158]       [.622]       [.097]       [.166]       [.348]       [.380]
Mean dep. var.        0.020        0.022        0.017        0.017        0.020        0.020
# of states           24           24           17           17           24           24
# op. authorities     49           49           31           31           49           49
# facilities          123          123          70           70           123          123
N                     5,079        5,079        2,875        2,751        4,815        4,815
Estimation            OLS          OLS          OLS          OLS          IV           IV
Sample restriction                              No ETC       No ETC
                                                discount     discount
Notes. Table reports results of estimating (17) (columns (1)–(3)) and (19) (columns (4)–(6)). Column headings define the dependent variable; the bottom two rows provide additional information on the estimation technique and sample restriction. ETCAdoptit is an indicator variable for whether facility i adopted ETC in year t. ETCit is an indicator variable for whether the facility has ETC; it is 1 in the year that ETC is adopted and in all subsequent years. ETC penetrationit measures the change in the proportion of tolls on the facility paid by ETC; it is zero if the facility does not have ETC. In column (5), the instrument for ETC penetrationit is ETCit . In column (6), the instrument for ETC penetrationit is a cubic polynomial in the number of years the facility has had ETC. In addition to the covariates shown in the table, all regressions include year fixed effects. Each operating authority receives equal weight. Standard errors (in parentheses) are clustered by state. p-values are reported in square brackets. “No ETC discounts” limits facilities to those that never offered an ETC discount. Declines in sample size in column (4) (compared to column (3)) and in column (5) or (6) (compared to column (1)) reflect missing data on ETC penetration rates (see Section IV).
in tolls after ETC is installed does not merely reflect a recouping of first-year losses from the ETC discount. For facilities that offer ETC discounts, there does not appear to be any systematic change in the discount over time after ETC adoption (not shown). This suggests that in practice increases in the minimum toll reflect a shift of the entire toll schedule, which is consistent with the finding that the manual toll also increases under ETC (column (2)).15
15. Although it might at first appear puzzling that the manual (i.e., cash) toll, which has become no less salient, also increases under ETC, this is easily understood by the necessary linkage between cash and electronic toll rates; were the electronic rate to increase while the cash rate did not, this would presumably discourage use of ETC. The preservation of the ETC discounts once ETC is installed
The Pattern of ETC Diffusion and Toll Increases. The preceding analysis constrains the effect of ETC to be the same across facilities and over time. However, if ETC increases tolls by reducing their salience, we would expect the effect to be increasing in the ETC penetration rate, whose diffusion rate is not constant over time (see Figure II) or across facilities (not shown). As a stronger test of the salience hypothesis, therefore, I examine how the time pattern of toll changes after ETC adoption compares to the time pattern of ETC diffusion. Specifically, I compare the coefficients from estimating
\[ \Delta\log(\text{min toll})_{it} = \gamma_t + \sum_{k=-9}^{9} \beta_k \, 1(\text{ETCYear}_{(k,k+1)}) + \varepsilon_{it} \tag{18a} \]
and
\[ \Delta\text{ETC Penetration}_{it} = \gamma_t + \sum_{k=1}^{9} \beta_k \, 1(\text{ETCYear}_{(k,k+1)}) + \varepsilon_{it}, \tag{18b} \]
where $\Delta\text{ETC Penetration}_{it}$ is the percentage point change in the ETC penetration rate for facility $i$ in year $t$. The key outcome of interest is a comparison of the time pattern of the coefficients on the indicator variables $1(\text{ETCYear}_{(k,k+1)})$ across the two equations. These are indicator variables for whether it is $k$ or $k + 1$ years since ETC was adopted on the facility. For example, $1(\text{ETCYear}_{(1,2)})$ is an indicator variable for whether ETC was adopted this year or last year (i.e., ETC Year is 1 or 2). In (18a), all of the indicator variables represent a two-year interval, except for the first (respectively, last) indicator variable, which is a “catch-all” variable for whether it is 9 or more years before (respectively, after) ETC adoption; the omitted category is the two years prior to adoption (i.e., ETC Year of −1 or −2). In (18b) I include only the post-ETC dummies that are in (18a). Figure IIIA shows the result. The solid black line shows the pattern of the log toll with respect to ETC Year implied by the estimates from (18a) and the dark dashed line shows the corresponding time pattern of ETC diffusion implied by the estimates
likely reflects continued attempts to induce more drivers to switch to ETC; the maximum ETC penetration rate in my sample is only 78%.
FIGURE III
Time Pattern of Toll Changes and ETC Diffusion
[Two panels, (A) and (B). Horizontal axis: ETC year (−8 to 8); left axis: log minimum toll; right axis: ETC penetration rate.]
The solid black line shows the pattern of log minimum toll implied by the estimates from (18a); the light dashed lines show the corresponding 95% confidence interval. The dark dashed line shows the pattern of the ETC penetration rate implied by estimating (18b). ETC year represents the number of years since (or before) ETC adoption. The omitted category (ETC year −2 for (18a) and all years prior to ETC adoption for (18b)) is set to zero. Indicator variables for whether it is nine or more years after ETC adoption are included in the estimating equation but not graphed; in (18a) an indicator variable for whether it is nine or more years before ETC adoption is also included in the regression but not graphed. In Panel B the sample of ETC-adopting facilities is limited to those that adopted in 1998 or earlier. The upper end of the 95% confidence interval for the log minimum toll at eight years is not shown for scale reasons; it is 0.201 (full sample, A) and 0.311 (balanced panel, B). To enhance the readability of the graph, the 95% confidence interval on ETC penetration rate is not shown. For Panel A the lower and upper bounds of the 95% confidence intervals for ETC penetration rate are as follows: (0.16, 0.378) for ETC year 2, (0.267, 0.484) for ETC year 4, (0.336, 0.565) for ETC year 6, and (0.378, 0.610) for ETC year 8. For Panel B, the analogous confidence intervals are (0.197, 0.283), (0.333, 0.425), (0.389, 0.550), and (0.419, 0.617).
of (18b).16 The results indicate that, after remaining roughly constant in the pre-ETC period, toll rates decline in the first two years of ETC (reflecting the discounts discussed earlier) and then climb steadily as ETC diffuses across the facility. Of course, the wide confidence intervals on the estimates caution against placing too much weight on the estimated time path. It is nonetheless reassuring that the point estimates suggest that the pattern of toll increases is similar to that of ETC diffusion.
A potential concern with this analysis is that the set of facilities that identify the different $\beta_k$s varies with the ETC year $k$. It is therefore difficult to distinguish the time path of the effect of ETC on a given facility from potentially heterogeneous effects of ETC across facilities.17 Figure IIIB therefore shows the results from re-estimating (18a) and (18b) when the sample of ETC-adopting facilities is limited to those that adopted ETC in 1998 or earlier. In this balanced panel of facilities, all of the graphed coefficients are identified by a constant set of facilities. The results are quite similar.18
For a more parametric (and higher-powered) analysis of how the time pattern of toll changes after ETC adoption compares with the diffusion of ETC, I estimate a modified version of (17):
\[ \Delta\log(\text{min toll})_{it} = \gamma_t + \beta_1 \text{ETCAdopt}_{it} + \beta_2 \Delta\text{ETC Penetration}_{it} + \varepsilon_{it}. \tag{19} \]
By replacing the indicator variable for whether the facility has ETC ($\text{ETC}_{it}$) with the percentage point change in ETC penetration ($\Delta\text{ETC Penetration}_{it}$), I now allow the effect of ETC to vary over time and across facilities as a function of the diffusion of ETC.19 As discussed, I must estimate equation (19) on
16. The scale of the graph is arbitrary. I set the omitted category to zero. Thus, for example, the log minimum toll in ETC Year 4 is $2\beta_1 + 2\beta_3$ and the log minimum toll in ETC Year −4 is $2\beta_{-4}$.
17. For the same reason, I do not extend the dummies in (18a) or (18b) for more years after ETC is adopted.
18. The point estimates in Figure IIIB indicate no preperiod trend in the balanced panel, which is reassuring relative to the (albeit statistically insignificant) suggestive evidence of some downward preperiod trend in the full sample in Figure IIIA. In Table VI I investigate the issue of potential preperiod trends in more detail, using a more parsimonious specification to increase statistical precision.
19. A more stringent test would be to include both $\Delta\text{ETC Penetration}_{it}$ and $\text{ETC}_{it}$ on the right-hand side to examine whether the diffusion of ETC has an impact on toll rates that can be distinguished from a linear trend. I find that while the two variables are jointly significant, it is not possible to distinguish the effect of ETC penetration separately from a linear trend (not shown). This is not surprising, because, on average, the data contain about six years of data on a
the subsample of facilities that never offer an ETC discount, as changes in the ETC discount will affect both the diffusion of ETC and the minimum toll. Column (4) of Table IV shows the results. The coefficient on the change in the ETC penetration rate is 0.623 (standard error 0.285). This indicates that every 10-percentage-point increase in ETC penetration is associated with a (statistically significant) toll increase of 6.2%. For the full sample of facilities, I estimate (19) instrumenting for $\Delta\text{ETC Penetration}_{it}$ with the indicator variable $\text{ETC}_{it}$; this is equivalent to instrumenting for the change in ETC penetration with a linear trend. Column (5) shows these results. The coefficient on $\Delta\text{ETC Penetration}_{it}$ is 0.557 (standard error 0.262), indicating that every 10-percentage-point increase in ETC penetration is associated with a (statistically significant) 5.6% increase in the toll. To allow the effect of ETC to vary over time, in column (6) I instead instrument for the change in ETC penetration with a cubic polynomial in the number of years the facility has had ETC. The coefficient on $\Delta\text{ETC Penetration}_{it}$ is now 0.501 (standard error 0.261). The results are also similar if I instead instrument for $\Delta\text{ETC Penetration}_{it}$ with a series of indicator variables for the number of years under ETC (not shown).
The magnitude of the estimated effect of ETC is quite similar across all of the various specifications shown in Table IV. The results from the baseline specification (Table IV, column (1)) suggest that after 14 years, by which point ETC has diffused to its steady state level (see Figure II), ETC is associated with an increase in the toll rate of 17%, or about one-sixth ($\sim \exp(\beta_{\text{ETCAdopt}} + 14\,\beta_{\text{ETC}})$). The IV estimates in columns (5) and (6) suggest that once ETC has diffused to its steady state level of 60%, it is associated with increases in tolls of 26 and 23%, respectively ($\sim \exp(\beta_{\text{ETCAdopt}} + 0.6\,\beta_{\Delta\text{ETC Penetration}})$).
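The implied steady-state increases quoted above follow from exponentiating the cumulated coefficients. A minimal sketch, using only the point estimates in Table IV together with the stated figures of 14 years to steady state and a 60% steady-state penetration rate (function names are illustrative, not from the original analysis):

```python
import math

def steady_state_baseline(beta_adopt, beta_etc, years=14):
    """Proportional toll increase after `years` of ETC under specification (17):
    a one-time adoption shift plus `years` annual growth increments."""
    return math.exp(beta_adopt + years * beta_etc) - 1.0

def steady_state_penetration(beta_adopt, beta_pen, penetration=0.6):
    """Proportional toll increase once penetration has risen from 0 to
    `penetration` under specification (19)."""
    return math.exp(beta_adopt + penetration * beta_pen) - 1.0

baseline = steady_state_baseline(-0.051, 0.015)       # Table IV, column (1)
iv_linear = steady_state_penetration(-0.105, 0.557)   # Table IV, column (5)
iv_cubic = steady_state_penetration(-0.097, 0.501)    # Table IV, column (6)
```

Rounded to two decimals, these give 0.17, 0.26, and 0.23, matching the 17%, 26%, and 23% figures in the text.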
19 (continued). facility with ETC, and the diffusion pattern of ETC is basically linear for those first six years (see Figure II).

When the sample is limited to facilities without ETC discounts, the implied steady state increase in tolls is 36% when (3) is estimated (column (3)) or 38% when (5) is estimated (column (4)). All of these implied steady state toll increases associated with ETC are statistically significant at the 10% level or better. Taken together, these estimates suggest that the diffusion of ETC to its steady state level is associated with a 20 to 40 percent increase in toll rates. Given the extremely inelastic demand for driving with respect to the toll
that I estimate below, these results suggest that the associated increase in revenue for the toll authority is also about 20 to 40 percent. Endogeneity of the Timing of ETC Adoption. I have analyzed the endogenous choice of tax rates while assuming that the choice of the salience of the tax system (i.e., the adoption of ETC) is exogenous. In practice, the decision to adopt ETC does not appear to be random. For example, as previously discussed, higher labor costs in the northeast may have encouraged more ETC adoption. This does not, however, pose a problem for the analysis per se, which requires only that the timing of ETC implementation be uncorrelated with changes in a facility’s toll setting relative to its norm. Nonetheless, the correlation of various observable characteristics with whether or when a facility adopts ETC (see Table II) raises concerns about the identifying assumption that absent the introduction of ETC on facility i in year t, toll rates would not have changed differentially for that facility. I therefore analyze the effect of ETC separately on samples stratified by these characteristics. Table V shows the results. Column (1) replicates the baseline specification (Table IV, column (1)). Columns (2) through (7) show the effects separately by geographic region, by facility type (bridges and tunnels vs. roads), and by facility age. Not only does statistical significance generally persist across the subsamples, but also the point estimates are remarkably similar.20 To more directly control for differences across facilities in the underlying rate of toll growth, column (8) shows that the results are robust to the addition of facility fixed effects to (17), which is equivalent to allowing facility-specific linear trends in toll rates. 
One specific source of omitted variable bias that the preceding analysis does not directly address is that ETC adoption may be a part of a broader infrastructure project, or a signal that infrastructure modernization is in the works. In this case, the relationship between ETC and toll increases may be spurious, as infrastructure projects may necessitate (or provide political cover for) toll increases. To investigate this possibility, I compiled histories of

20. As a distinct exercise, I was also interested in whether the impact of ETC varied between operating authorities that automatically send monthly statements of expenses to users and authorities from which drivers had to actively request (and in some cases pay for) ETC expense statements. The point estimates did not suggest any economically or statistically differential impact of ETC on toll rates along this dimension, although the standard errors were sufficiently large that it was not possible to rule out fairly large differences (not shown).
TABLE V
IMPACT OF ETC ON TOLL RATES: ROBUSTNESS ANALYSIS

Columns: (1) Baseline; (2) Northeast and midwest; (3) South and west; (4) Roads; (5) Bridges and tunnels; (6) Open after 1960; (7) Open 1960 or before; (8) Facility fixed effects; (9) and (10) Facilities with infrastructure data.

                      (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)      (10)
ETCAdopt_it         −0.051   −0.048   −0.044   −0.023   −0.086    0.051   −0.084   −0.053   −0.055   −0.055
                    (0.035)  (0.054)  (0.027)  (0.063)  (0.017)  (0.079)  (0.025)  (0.036)  (0.035)  (0.035)
                    [.158]   [.399]   [.137]   [.719]   [.000]   [.534]   [.003]   [.147]   [.125]   [.124]
ETC_it               0.015    0.016    0.014    0.015    0.028    0.021    0.013    0.013    0.014    0.014
                    (0.006)  (0.010)  (0.005)  (0.008)  (0.010)  (0.010)  (0.007)  (0.007)  (0.007)  (0.007)
                    [.018]   [.141]   [.030]   [.067]   [.015]   [.065]   [.079]   [.072]   [.048]   [.048]
INFRAAdopt_it                                                                                         0.017
                                                                                                     (0.014)
                                                                                                     [.221]
INFRA_it                                                                                             −0.003
                                                                                                     (0.007)
                                                                                                     [.659]
Mean dep. var.       0.020    0.022    0.017    0.021    0.020    0.019    0.021    0.020    0.021    0.021
# of states            24       14       10       18       16       13       21       24       23       23
# op. authorities      49       28       21       24       31       20       39       49       46       46
# of facilities       123       68       55       44       79       43       77      123      115      115
N                   5,079    3,008    2,071    1,692    3,387    1,389    3,690    5,079    4,712    4,712

Notes. Table reports results from estimating variants of (17) by OLS. The dependent variable is the change in the log minimum toll. All regressions include year fixed effects (not shown). Each operating authority receives equal weight. ETCAdopt_it is an indicator variable for whether facility i adopted ETC in year t. ETC_it is an indicator variable for whether the facility has ETC; it is 1 in the year that ETC is adopted and in all subsequent years. Columns (2) and (3) limit the sample to, respectively, facilities in the northeast and midwest, and facilities in the south and west. Columns (4) and (5) limit the sample to, respectively, roads, and bridges or tunnels. Columns (6) and (7) limit the sample to, respectively, facilities that opened after 1960 and facilities that opened in 1960 or earlier. Column (8) adds facility fixed effects to the right-hand side of (17). In columns (9) and (10) the sample is limited to the 115 facilities for which infrastructure data are available. INFRAAdopt_it is an indicator variable for whether facility i started a new infrastructure project in year t. INFRA_it is an indicator variable for whether facility i has an infrastructure project in progress in year t; it is 1 in the year that the project is started and in all subsequent years that the project is in progress. All estimates give equal weight to each operating authority. Standard errors in parentheses are clustered by state, and p-values are shown in square brackets.
infrastructure projects on 115 of the 123 individual toll facilities.21 These histories report the timing of a variety of infrastructure projects including renovations, replacements, repairs, widenings, extensions, and other improvements. I constructed indicator variables for whether facility i started an infrastructure project in year t (INFRAAdoptit ) and whether it had a project either started or ongoing in year t (INFRAit ). On average, a project was started in 2.2% of facility-years, and 10.1% of facility-years had an infrastructure project either starting or ongoing. I reestimate the basic relationship between ETC and toll increases (equation (17)) with these two additional variables included as covariates. Column (9) shows that the baseline results (without the additional infrastructure variables) are unaffected by restricting the sample to the 115 facilities for which I have data on infrastructure projects. Column (10) shows that the estimated increase in tolls associated with ETC is not affected in either magnitude or statistical significance by including the two infrastructure variables as controls. This suggests that the increase in tolls associated with ETC is not likely to be spuriously due to a correlation between ETC and infrastructure projects, which themselves are responsible for toll increases; indeed, the results suggest that infrastructure projects are not, in fact, associated with toll increases. There are of course many reasons, besides infrastructure projects, that the timing of ETC adoption might be spuriously correlated with toll increases. For example, facilities may respond to increased congestion by both adopting ETC and by raising tolls as complementary congestion-reducing strategies. This suggests we should observe increases in congestion (or a proxy for it such as traffic) on a facility prior to ETC adoption. 
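The construction of the two infrastructure indicators described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's procedure: the `Project` record, its field names, and the function `infra_indicators` are hypothetical, and each project history is assumed to reduce to a start year and a last year in progress.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Project:
    """Hypothetical record for one infrastructure project on a facility."""
    start_year: int
    end_year: int  # last year in which the project is in progress (inclusive)


def infra_indicators(projects: List[Project], years: Iterable[int]) -> Dict[int, Tuple[int, int]]:
    """Return {year: (INFRAAdopt, INFRA)} for a single facility.

    INFRAAdopt is 1 if any project starts in that year.
    INFRA is 1 if any project is starting or ongoing in that year.
    """
    indicators = {}
    for year in years:
        started = int(any(p.start_year == year for p in projects))
        in_progress = int(any(p.start_year <= year <= p.end_year for p in projects))
        indicators[year] = (started, in_progress)
    return indicators


# Illustrative facility with one project running from 1990 through 1992.
example = infra_indicators([Project(1990, 1992)], range(1989, 1994))
for year, (adopt, ongoing) in example.items():
    print(year, adopt, ongoing)
```

By construction the start indicator is 1 only when the ongoing indicator is also 1, which is in line with the reported shares of facility-years (2.2% with a project start versus 10.1% with a project starting or ongoing).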
Alternatively, facilities might respond to a negative revenue shock by both raising tolls and adopting ETC, with the latter a way to lower revenue losses from the administrative costs of toll collection. This suggests we should observe declining revenue (or declining traffic) on a facility in the years prior to ETC adoption. More generally, we can look for changes in toll rates in the years prior to ETC adoption as a partial test of the identifying assumption that absent the adoption of ETC, a facility would not have experienced differential changes in its toll rate. Of course, if the lower salience of ETC

21. The primary source of data was facility Web pages and annual reports, which often provide detailed histories of work on the facilities. The level of detail and the nature of the projects reported vary across facilities. However, because all of the analysis is within-facility, this should not pose a problem.
TABLE VI
CHANGES IN TRAFFIC, REVENUE, AND TOLLS PRIOR TO ETC ADOPTION

Dep. var.:                          log(traffic)        log(revenue)      log(minimum toll)
                                    (1)      (2)        (3)      (4)        (5)      (6)
1–2 years before ETCAdopted_it    −0.000              −0.009               0.004
                                  (0.007)             (0.016)             (0.013)
                                  [.955]              [.599]              [.777]
1–5 years before ETCAdopted_it              0.013               0.006               0.009
                                           (0.010)             (0.012)             (0.007)
                                           [.198]              [.601]              [.242]
ETCAdopt_it                       −0.000    0.000      0.002    0.002     −0.051   −0.051
                                  (0.010)  (0.010)    (0.025)  (0.025)    (0.035)  (0.035)
                                  [.996]   [.978]     [.922]   [.930]     [.158]   [.162]
ETC_it                            −0.006   −0.001      0.028    0.031      0.016    0.017
                                  (0.010)  (0.010)    (0.015)  (0.015)    (0.006)  (0.006)
                                  [.551]   [.959]     [.090]   [.058]     [.018]   [.008]
Mean dep. var.                         0.049               0.077               0.020
# of states                              21                  13                  24
# op. authorities                        32                  19                  49
# of facilities                          76                  45                 123
N                                     2,200               1,411               5,079

Notes. Table reports results from estimating variants of (17) by OLS. Dependent variables are defined in the column headings. In addition to the covariates shown in the table, all regressions include year fixed effects. Each operating authority receives equal weight. Standard errors (in parentheses) are clustered by state. p-values are reported in square brackets. “1–2 years before ETCAdopted_it” is an indicator variable for whether it is one to two years before the facility adopts ETC. “1–5 years before ETCAdopted_it” is an indicator variable for whether it is one to five years before the facility adopts ETC. ETCAdopt_it is an indicator variable for whether facility i adopted ETC in year t. ETC_it is an indicator variable for whether the facility has ETC; it is 1 in the year that ETC is adopted and in all subsequent years.
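The indicator definitions in the notes to Table VI can be written out as a short sketch. The function names and the use of `None` for a facility that never adopts ETC within the sample are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Optional, Tuple


def pre_adoption_indicator(year: int, adoption_year: Optional[int], window: int) -> int:
    """1 if `year` is between one and `window` years before ETC adoption, else 0."""
    if adoption_year is None:  # assumption: facility never adopts ETC in the sample
        return 0
    years_until_adoption = adoption_year - year
    return int(1 <= years_until_adoption <= window)


def etc_indicators(year: int, adoption_year: Optional[int]) -> Tuple[int, int]:
    """Return (ETCAdopt, ETC): adoption-year dummy and has-ETC dummy."""
    if adoption_year is None:
        return (0, 0)
    return (int(year == adoption_year), int(year >= adoption_year))


# Illustrative facility that adopts ETC in 1998.
for year in range(1992, 2001):
    print(
        year,
        pre_adoption_indicator(year, 1998, window=2),  # "1-2 years before"
        pre_adoption_indicator(year, 1998, window=5),  # "1-5 years before"
        *etc_indicators(year, 1998),
    )
```

Note that the pre-adoption indicators are 0 in the adoption year itself, so they do not overlap with either ETC dummy.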
made it easier to raise tolls, ETC might be adopted precisely by facilities that were encountering difficulties in making needed toll increases, suggesting that facilities might experience declines in traffic, revenue, or toll increases prior to ETC adoption. Although evidence of such effects would therefore not necessarily be inconsistent with the salience story, the lack of any such evidence reduces concerns about omitted variable bias and spurious findings. Table VI shows the results. I reestimate (17) with three different dependent variables: log(traffic)it (columns (1) and (2)), log(revenue)it (columns (3) and (4)), and log(minimum toll)it (columns (5) and (6)). In addition to the standard regressors (year fixed effects, ETCAdoptit , and ETCit ), I also include an indicator variable for whether it is one to two years prior to ETC adoption (odd columns) or whether it is one to five years prior to ETC
adoption (even columns). The coefficients on these indicator variables for years just prior to ETC adoption show no statistically or substantively significant evidence of systematic changes in traffic, revenue, or tolls in the years prior to a facility’s adopting ETC. These results are consistent with the results from estimating (18a), which show no systematic preexisting trend in toll rates prior to a facility’s adoption of ETC, particularly in the balanced panel (see Figures IIIA and IIIB). One reason that the various endogeneity concerns may not in practice be a problem is that, as noted in Section IV.B, the different facilities run by a given operating authority tend to adopt ETC all at the same time, and yet may be experiencing different patterns of traffic and tolls.22

There are several other results of interest in Table VI. The finding in columns (3) and (4) that revenue increases by about 3 percent per year under ETC is broadly consistent with the estimated increase in tolls under ETC and the finding that demand for driving is very inelastic with respect to the toll.23 There is also some suggestive evidence in columns (1) and (2) that traffic declines under ETC, although these estimates are not statistically significant and are substantively quite small; a decline in traffic would be consistent with the survey evidence in Section III of overestimation of toll levels by ETC users.

VI.B. The Impact of ETC on the Politics of Toll Setting

The model in Section II.B suggested two potential mechanisms behind a finding that reduced salience is associated with increased tax rates: (i) a reduced behavioral responsiveness to taxes and (ii) a reduction in the political costs of tolls, particularly in the differential political costs of tolls in election years compared to nonelection years. Section V presented evidence for the first potential mechanism.
To investigate the political channel, I examine whether there are political costs to tolls and how these costs change under ETC. Table VII shows the results. Because the political fallout from raising tolls may be concentrated on the extensive margin (i.e., whether tolls are raised), I report results not only for the baseline

22. In a different context, Dusek (2003) examines the impact of the introduction of state income tax withholding on tax rates, but notes that the decision to introduce income tax withholding appears to be correlated with increased demand for bigger government, making the results hard to interpret.

23. For the sample for which I have revenue data, I estimate that ETC is associated with a 2.2% increase in tolls each year (not shown).
TABLE VII
THE IMPACT OF ETC ON THE POLITICS OF TOLL SETTING

                                  (1)        (2)        (3)        (4)        (5)        (6)
Dep. var.:                     log min.   Min. toll  log min.   Min. toll  log min.   Min. toll
                               toll       raised?    toll       raised?    toll       raised?
ETC_it                          0.015      0.073      0.006      0.044      0.006      0.044
                               (0.006)    (0.024)    (0.009)    (0.022)    (0.009)    (0.022)
                               [.018]     [.006]     [.507]     [.042]     [.494]     [.042]
AnyElecYear_st                                       −0.016     −0.029
                                                     (0.004)    (0.010)
                                                     [.000]     [.003]
GovElecYear_st                                                             −0.016     −0.036
                                                                           (0.005)    (0.012)
                                                                           [.001]     [.002]
LegOnlyElecYear_st                                                         −0.015     −0.021
                                                                           (0.005)    (0.012)
                                                                           [.005]     [.085]
AnyElecYear_st * ETC_it                               0.017      0.055
                                                     (0.012)    (0.027)
                                                     [.140]     [.041]
GovElecYear_st * ETC_it                                                     0.004      0.016
                                                                           (0.014)    (0.033)
                                                                           [.791]     [.617]
LegOnlyElecYear_st * ETC_it                                                 0.030      0.094
                                                                           (0.014)    (0.033)
                                                                           [.038]     [.005]

Notes. Columns (1) and (2) report estimates of (17); columns (3)–(6) report estimates of (20). Dependent variable (shown in column heading) is log minimum toll (odd columns) or an indicator variable for whether the minimum toll was raised (even columns). In addition to the covariates shown in the table, all regressions include year fixed effects, ETCAdopt_it, and interactions between ETCAdopt_it and any indicator variables for the election year included in the regression. Each operating authority receives equal weight. Standard errors (in parentheses) are clustered by state. p-values are in square brackets. “AnyElecYear_st” is an indicator variable for whether state s’s governor or legislature is up for election in year t. “GovElecYear_st” is an indicator variable for whether the governor (and therefore almost always the legislature as well) is up for election. “LegOnlyElecYear_st” is an indicator variable for whether only the legislature is up for election. ETC_it is an indicator variable for whether the facility has ETC; it is 1 in the year that ETC is adopted and in all subsequent years. Sample size in all columns is 5,079 facility-years, 123 facilities, 49 operating authorities, and 24 states. The mean of the dependent variable is 0.020 (odd columns) and 0.077 (even columns).
dependent variable log minimum toll (odd columns) but also for the binary dependent variable of whether the minimum toll increased (even columns). Column (1) replicates the baseline results from (17) (see Table IV, column (1)). Column (2) shows the results from estimating (17) with the binary dependent variable for whether the minimum toll was raised that year; the coefficient on ETCit is 0.073 (standard error 0.024). This suggests that, relative to the baseline 7.7% annual probability of a toll increase, the probability of a toll increase almost doubles on a facility once it
has ETC. Combined with the evidence in column (1), this suggests that the increase in tolls associated with ETC comes about primarily through more frequent toll increases of similar magnitude.

I then expand the baseline specification in (17) to include indicator variables for whether it is an election year, and the interactions of these indicators with the change in salience, as proposed in estimating equation (15) from Section II.B. This allows me to examine whether there is a political business cycle in toll setting and whether this political business cycle varies under manual toll collection and ETC. Specifically, I estimate

$$
\begin{aligned}
y_{it} = {} & \gamma_t + \beta_1\,\mathit{ETCAdopt}_{it} + \beta_2\,\mathit{ETC}_{it} + \beta_3\,\mathbf{1}(\mathit{ElecYear})_{st} \\
& + \beta_4\,\mathbf{1}(\mathit{ElecYear})_{st} \ast \mathit{ETCAdopt}_{it} + \beta_5\,\mathbf{1}(\mathit{ElecYear})_{st} \ast \mathit{ETC}_{it} + \varepsilon_{it}.
\end{aligned}
\tag{20}
$$
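Equation (20) implies that the election-year shift in the dependent variable is β3 under manual toll collection and β3 + β5 once a facility has ETC (setting aside the adoption year itself, where β4 also enters). The sketch below applies this to the point estimates in Table VII. The dictionary keys and the function name are hypothetical, and the assignment of each estimate to a column reflects a reading of the table layout rather than anything stated separately in the text.

```python
from typing import Dict

# (beta3, beta5) point estimates as read from Table VII; labels are illustrative.
TABLE_VII_ESTIMATES: Dict[str, tuple] = {
    "(3) any election, log min. toll": (-0.016, 0.017),
    "(4) any election, toll raised": (-0.029, 0.055),
    "(5) governor election, log min. toll": (-0.016, 0.004),
    "(5) legislature-only election, log min. toll": (-0.015, 0.030),
    "(6) governor election, toll raised": (-0.036, 0.016),
    "(6) legislature-only election, toll raised": (-0.021, 0.094),
}


def election_year_effect(beta3: float, beta5: float, has_etc: bool) -> float:
    """Election-year shift in the dependent variable implied by equation (20).

    Under manual collection only beta3 applies; under ETC the interaction
    coefficient beta5 is added. The adoption-year term (beta4) is ignored.
    """
    return beta3 + (beta5 if has_etc else 0.0)


for label, (beta3, beta5) in TABLE_VII_ESTIMATES.items():
    manual = election_year_effect(beta3, beta5, has_etc=False)
    with_etc = election_year_effect(beta3, beta5, has_etc=True)
    print(f"{label}: manual {manual:+.3f}, ETC {with_etc:+.3f}")
```

These are sums of point estimates only; judging whether a sum differs significantly from zero requires the covariance between the two coefficients, which is not reported in the table.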
Columns (3) and (4) report results when $\mathbf{1}(\mathit{ElecYear})_{st}$ is an indicator for whether there is any state election (for either the governor or the legislature) in state s and year t; about half of the facility-years in the data are election years, but the timing of the electoral calendar varies across states. Columns (5) and (6) report results when $\mathbf{1}(\mathit{ElecYear})_{st}$ is replaced by two separate indicators for whether the governor (and therefore almost always the legislature as well) is up for election and for whether only the legislature is up for election; each of these indicator variables is turned on in roughly one-fourth of state-years. In all four specifications, the coefficients on all of the election year indicators are negative and statistically significant; this demonstrates the political business cycle under manual toll collection. Given the average annual 2% increase in tolls, the coefficient on the election year dummies of about −0.016 in columns (3) and (5) indicates that toll increases are about 75% lower during election years than during nonelection years under manual toll collection. The interaction term between the election year indicator variables and ETC is always positive; it is statistically significant for legislature-only election years (columns (5) and (6)) and statistically significant (or only marginally insignificant) for any election year (columns (3) and (4)). This suggests that under ETC, toll-setting behavior is less sensitive to the political election calendar (particularly legislature elections) than under manual toll collection. Indeed, there is no evidence that toll increases are lower in election years relative to nonelection years under electronic toll
collection; the sum of the coefficients on the election year indicator variable and its interaction with ETC (i.e., $\beta_3 + \beta_5$) is almost always positive (and never significantly negative).24

VII. ALTERNATIVE EXPLANATIONS

In this section, I briefly consider a range of alternative explanations for the increase in tolls associated with ETC other than the decline in the salience of the toll. I note at the outset that a general point in favor of the salience-based explanation is the finding that toll setting becomes less sensitive to the local election calendar under ETC; this is consistent with a decline in salience reducing the political costs of raising tolls, but would not be predicted by any of the alternative explanations I discuss.

VII.A. ETC Lowers the Operating Cost of Toll Collection

ETC is associated with substantial reductions in the annual costs of operating and maintaining toll facilities; the ETC cost savings come primarily from reductions in the labor costs associated with manual toll collection (Hau 1992; Pietrzyk and Mierzejewski 1993; Levinson 2002).25 However, for increases in the efficiency of tax collection to raise the equilibrium tax rate, there must be an improvement in the marginal efficiency of tax collection (Becker and Mulligan 2003). By contrast, ETC improves the fixed component of the efficiency cost of taxation (because the administrative cost savings are independent of the toll rate), which should therefore not prompt an increase in the rate of existing taxes.26

24. The “main effect” of ETC, although positive, is no longer statistically significant in columns (3) and (5); toll increases are not statistically significantly larger in nonelection years under ETC than under manual toll collection. However, toll increases are statistically significantly larger in election years under ETC than under manual toll collection; the sum of the coefficients on ETC and the interaction of ETC and election year (i.e., $\beta_2 + \beta_5$ in (20)) is statistically significant in column (3) and statistically significant for the legislative election year variable in column (5) (not shown).

25. Toll collection costs under manual toll collection can be quite high. A 1995 study of turnpikes in Massachusetts and New Jersey estimated that toll collection costs under manual toll collection were about 6 percent of toll revenue (Friedman and Waldfogel 1995); a 2006 study found that on portions of the Massachusetts Turnpike where there is relatively little traffic, toll collection costs were over one-third of toll revenue (Kriss 2006).

26. Note, moreover, that if operating authorities set tolls to meet an exogenous revenue requirement, the reduction in administrative costs would lower (rather than raise) the equilibrium toll needed to raise a fixed amount of (net) revenue.

A decline in the fixed administrative costs of tax collection could, however, encourage the introduction of new taxes, such as
the introduction of tolls on roads that had not previously been tolled or the construction of new (tolled) roads where no road existed before. Any such effects of ETC, however, would not show up in my analysis, which limits the sample to facilities with preexisting tolls. Lower fixed administrative costs of toll collection could also encourage the installation of more toll collection points on an existing toll facility; however, I find no evidence that ETC had such an effect.27

27. I reestimate (17) using as a dependent variable a binary measure for whether there is an increase in the number of toll transactions someone driving a one-way, full-length trip on the facility would have to make. I perform this analysis for the full sample of facilities, and separately both for roads and for bridges and tunnels (not shown).

VII.B. ETC Installation Requires Capital Outlay

Although ETC lowers the costs of operating and maintaining toll facilities, installation of ETC requires a capital outlay. It seems unlikely that this capital outlay would require an increase in tolls. Operating authorities can borrow to cover these capital costs, and the capital costs are recouped within a few years by the savings in operating and maintenance costs, and by revenue from the sale or lease of the transponders and interest on prepayments and deposits (Hau 1992; Pietrzyk and Mierzejewski 1993). Of course, it is possible that operating authorities might use the installation costs of ETC as an excuse to raise tolls, even though ETC is self-financing. Any such excuse might be used for a one-time increase in tolls when ETC comes in; it seems less natural that this excuse could be used for subsequent increases in tolls as ETC use diffuses among drivers.

VII.C. Changes in Menu Costs Associated with ETC

It is possible that ETC lowers the administrative (menu) cost of toll changes. There could be literal menu cost savings if signs listing the toll rate no longer had to be changed under ETC. Alternatively, ETC might allow smaller increases of non-“round” amounts; unlike with manual tolls, this would not require drivers to carry small coins. In practice, however, ETC tolls are not less “round” than manual tolls, except when they are specified as a fixed percentage discount off of the manual toll. In addition, the increase in tolls associated with ETC persists for the subsample of facilities that do not offer discounts; for these facilities, there can be no menu cost savings, as changing the electronic toll
requires changing the manual toll, and all facilities continue to have at least some manual payers. Finally, even if ETC did reduce menu costs, this should suggest that ETC would be associated with more frequent toll adjustments, but it is not clear why this would produce a higher equilibrium toll rate.

VII.D. ETC Lowers Personal Compliance Costs of Toll Payment

ETC reduces the drivers’ compliance costs of paying tolls (Hau 1992; Levinson 2002). Friedman and Waldfogel (1995) estimate that under manual toll collection, these compliance costs (which consist of time spent queuing and paying tolls at the toll plaza) are, on average, about 15% of toll revenue. Reductions in compliance costs of paying tolls may directly increase drivers’ willingness to pay the (monetary) toll, and hence provide an alternative explanation for the observed increase in toll rates. In practice, however, two independent pieces of empirical evidence suggest that toll authorities do not increase tolls in response to reductions in compliance costs; this is consistent with the finding in Section V that they set tolls substantially below the revenue-maximizing rate (i.e., that they implicitly place a relatively large weight on consumer surplus).

The first piece of suggestive evidence comes from variation across roads in the number of times an individual must make a toll transaction, and hence variation in the compliance costs savings from ETC. For example, in 1985 an individual made eleven toll transactions while driving the length of the Garden State Parkway, compared to only two on the New Jersey Turnpike. If tolls were increased under ETC in response to the reductions in compliance costs, we would expect greater toll increases on roads with a greater number of toll transactions. In fact, I find weak evidence of the opposite.

The second piece of suggestive evidence comes from what happens to toll rates when a bridge or tunnel switches from collecting tolls at both ends of the facility to collecting tolls at only one end; at various times over the course of my sample, about half of the bridges and tunnels (40 of 79) made this switch, which reduced compliance costs on their facility by one-half. I find little evidence of a substantively or statistically significant increase in tolls on a facility following this reduction in compliance costs.28

28. The results of both of these analyses are presented in more detail in the Online Appendix (Section C) and in the working paper version of this paper (Finkelstein 2007).
VII.E. ETC Raises the Optimal Congestion-Correcting Toll

Could the increase in tolls under ETC come entirely from the increase in the optimal congestion externality–reducing toll that results from the reduced consumer responsiveness to tolls? This would suggest that the effect of ETC on toll rates is a salience effect, but one that comes entirely from a reduction in salience at the time of consumption (driving). This seems unlikely given the evidence in Section VI.B that ETC affects the political costs of raising tolls; this suggests that at least some of the toll increase associated with ETC is likely to be due to a decline in voting salience. In addition, as an (admittedly quite) crude test of whether the increase in tolls under ETC is driven by an increase in congestion under ETC, I experimented with controlling for traffic (a proxy for congestion) on the right-hand side of (17). I found that the impact of ETC on the change in tolls is not sensitive to including traffic as a control, suggesting that even conditional on the level of traffic, tolls still rise under ETC (not shown).
VIII. CONCLUSIONS

This paper has examined the hypothesis that a less salient tax system can produce a higher equilibrium tax rate. Belief in this possibility has contributed to opposition to tax reforms that are believed to reduce tax salience, such as Federal income tax withholding or partial replacement of the income tax with a value-added tax. Yet the sign of the effect of tax salience on tax rates is theoretically ambiguous, and empirical evidence has been lacking.

I examine the relationship between tax salience and tax rates empirically by looking at the impact of electronic toll collection (ETC) on toll rates. Survey evidence indicates that drivers who pay tolls electronically are substantially less aware of toll rates than those who pay with cash, suggesting that ETC reduces tolls’ salience. To analyze the impact of this reduction in salience, I assembled a new data set on toll rates over the last half century on 123 toll facilities in the United States. Because different toll facilities adopted ETC in different years, and some have not yet adopted it, I am able to examine the within-toll facility change in tolls associated with the introduction of electronic toll collection. I find robust evidence that toll rates increase following the adoption of electronic toll collection. The estimates suggest that after ETC use among drivers has diffused to its steady state level,
toll rates are 20 to 40 percent higher than they would have been under manual toll collection. I provide evidence of two potential mechanisms by which reduced salience may contribute to increased toll rates: under ETC driving behavior becomes less elastic (in absolute value) with respect to the toll, and toll setting becomes less sensitive to the local election calendar. This decline in the political costs of raising tolls associated with ETC would not be predicted by alternative explanations for the increase in tolls associated with ETC. I also present additional evidence that is not consistent with specific alternative explanations. As previously discussed, the normative implications of these findings are ambiguous. Evidence on what is done with the extra revenue from the higher tolls—in particular, whether it is used for purposes that may be valued by users of the facility such as infrastructure investment or reductions in other highway fees, or whether it primarily serves to increase rents for the governing authority through increased employment or salaries of bureaucrats—could help shed some light on the normative implications of the higher tolls under ETC. Unfortunately, the available data are not sufficient for analysis of this issue. The results also leave open the question of how tax salience affects tax rates in other contexts, such as federal income tax withholding or the replacement of a sales tax with a value added tax. As previously discussed, the sign of the effect of tax salience on tax rates may well differ for taxes that are a larger share of expenditures than tolls. The magnitude of any effect of tax salience is also likely to differ across different political institutions. The results in this paper suggest that the salience of the tax instrument is an important element to consider in both theoretical and empirical investigations of the political economy of tax setting. 
Relatedly, they suggest that the empirical impact of tax salience in these other specific settings is an interesting and important direction for further work.

MASSACHUSETTS INSTITUTE OF TECHNOLOGY AND NBER
THE BOND MARKET’S q∗ THOMAS PHILIPPON I propose an implementation of the q-theory of investment using bond prices instead of equity prices. Credit risk makes corporate bond prices sensitive to future asset values, and q can be inferred from bond prices. With aggregate U.S. data, the bond market’s q fits the investment equation six times better than the usual measure of q, it drives out cash flows, and it reduces the implied adjustment costs by more than an order of magnitude. Theoretical interpretations for these results are discussed.
I. INTRODUCTION In his 1969 article, James Tobin argued that “the rate of investment—the speed at which investors wish to increase the capital stock—should be related, if to anything, to q, the value of capital relative to its replacement cost” (Tobin 1969, p. 21). Tobin also recognized, however, that q must depend on “expectations, estimates of risk, attitudes towards risk, and a host of other factors,” and he concluded that “it is not to be expected that the essential impact of [. . . ] financial events will be easy to measure in the absence of direct observation of the relevant variables (q in the models).” The quest for an observable proxy for q was therefore recognized as a crucial objective from the very beginning. Subsequent research succeeded in integrating Tobin’s approach with the neoclassical investment theory of Jorgenson (1963). Lucas and Prescott (1971) proposed a dynamic model of investment with convex adjustment costs, and Abel (1979) showed that the rate of investment is optimal when the marginal cost of installment is equal to q − 1. Finally, Hayashi (1982) showed that, under perfect competition and constant returns to scale, marginal q (the market value of an additional unit of capital divided by its replacement cost) is equal to average q (the market value of existing capital divided by its replacement cost). Because average q is observable, the theory became empirically relevant. 
∗ This paper was first circulated under the title “The y-Theory of Investment.” I thank Robert Barro (the editor), three anonymous referees, Daron Acemoglu, Mark Aguiar, Manuel Amador, Luca Benzoni, Olivier Blanchard, Xavier Gabaix, Mark Gertler, Simon Gilchrist, Bob Hall, Guido Lorenzoni, Sydney Ludvigson, Pete Kyle, Lasse Pedersen, Christina Romer, David Romer, Ivan Werning, Toni Whited, Jeff Wurgler, Egon Zakrajsek, and seminar participants at NYU, MIT, the SED 2007, London Business School, Ente Einaudi (Rome), University of Salerno, Toulouse University, Duke University, and the NBER Summer Institutes 2006 and 2007. Peter Gross provided excellent research assistance.
© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009
Unfortunately, its implementation proved disappointing. The investment equation fits poorly, leaves large unexplained residuals correlated with cash flows, and implies implausible parameters for the adjustment cost function (see Summers [1981] for an early contribution, and Hassett and Hubbard [1997] and Caballero [1999] for recent literature reviews).
Several theories have been proposed to explain this failure. Firms could have market power, and might not operate under constant returns to scale. Adjustment costs might not be convex (Dixit and Pindyck 1994; Caballero and Engel 1999). Firms might be credit-constrained (Fazzari, Hubbard, and Petersen 1988; Bernanke and Gertler 1989). Finally, there could be measurement errors and aggregation biases in the capital stock or the rate of investment.
None of these explanations is fully satisfactory, however. The evidence for constant returns and price-taking seems quite strong (Hall 2003). Adjustment costs are certainly not convex at the plant level, but it is not clear that it really matters in the aggregate (Thomas 2002; Hall 2004), although this is still a controversial issue (Bachmann, Caballero, and Engel 2006). Gomes (2001) shows that Tobin’s q should capture most of investment dynamics even when there are credit constraints. Heterogeneity and aggregation do not seem to create strong biases (Hall 2004).
In fact, an intriguing message comes out of the more recent empirical research: the market value of equity seems to be the culprit for the empirical failure of the investment equation. Gilchrist and Himmelberg (1995), following Abel and Blanchard (1986), use VARs to forecast cash flows and to construct q, and they find that it performs better than the traditional measure based on equity prices. Cummins, Hassett, and Oliner (2006) use analysts’ forecasts instead of VAR forecasts and reach similar conclusions. Erickson and Whited (2000, 2006) use GMM estimators to purge q of measurement errors.
They find that only 40% of observed variations are due to fundamental changes, and, once again, that market values contain large “measurement errors.” Applied research has therefore reached an uncomfortable situation, where the benchmark investment equation appears to be successful only when market prices are not used to construct q. This is unfortunate, because Tobin’s insight was precisely to link observed quantities and market prices. The contribution of this paper is to show that a market-based measure of q can be constructed from corporate bond prices and that this measure performs much better than the traditional one.
Why would the bond market’s q perform better than the usual measure? There are several possible explanations, two of which are discussed in detail in this paper. The first explanation is that total firm value includes the value of growth options, that is, opportunities to expand into new areas and new technologies. With enough skewness, these growth options end up affecting equity prices much more than bond prices. If, in addition, these growth options are unrelated to existing operations, they do not affect current capital expenditures. As a result, bond prices are more closely related to the existing technology’s q, while equity prices reflect organizational rents.
A second possible explanation is that the bond market is less susceptible to bubbles than the equity market. In fact, there is empirical and theoretical support for the idea that mispricing is more likely to happen when returns are positively skewed. Barberis and Huang (2007) show that cumulative prospect theory can explain how a positively skewed security becomes overpriced. Brunnermeier, Gollier, and Parker (2007) argue that preference for skewness arises endogenously because investors choose to be optimistic about the states associated with the most skewed Arrow–Debreu securities. Empirically, Mitton and Vorkink (2007) document that underdiversification is largely explained by the fact that investors sacrifice mean–variance efficiency for higher skewness exposure. These insights, combined with the work of Stein (1996) and Gilchrist, Himmelberg, and Huberman (2005) showing why rational managers might not react (or, at least, not much) to asset bubbles, provide another class of explanations.1
Of course, even if we accept the idea that bond prices are somehow more reliable than equity prices, it is far from obvious that it is actually possible to use bond prices to construct q.
1. Other rational explanations can also be proposed. These explanations typically involve different degrees of asymmetric information, market segmentation, and heterogeneity in adjustment costs and stochastic processes. For instance, firms might be reluctant to use equity to finance capital expenditures, because of adverse selection, in which case the bond market might provide a better measure of investment opportunities (Myers 1984). It is much too early at this stage to take a stand on which explanations are most relevant.
The contribution of this paper is precisely to show how one can do so, by combining the insights of Black and Scholes (1973) and Merton (1974) with the approach of Abel (1979) and Hayashi (1982). In the Black–Scholes–Merton model, debt and equity are seen as derivatives of the underlying assets. In the simplest case, the market value of corporate debt is a function of its face value, asset
volatility, and asset value. But one can also invert the function, so that, given asset volatility and the face value of debt, one can construct an estimate of asset value from observed bond prices. I extend this logic to the case where asset value is endogenously determined by capital expenditures decisions. As in Hayashi (1982), I assume constant returns to scale, perfect competition, and convex adjustment costs. There are no taxes and no bankruptcy costs, so the Modigliani–Miller theorem holds, and real investment decisions are independent from capital structure decisions.2 Firms issue long-term, coupon-paying bonds as in Leland (1998), and the default boundary is endogenously determined to maximize equity value, as in Leland and Toft (1996). There are two crucial differences between my model and the usual asset pricing models. First, physical assets change over time. Under constant returns to scale, however, I obtain tractable pricing formulas, where the usual variables are simply scaled by the book value of assets. Thus, book leverage plays the role of the face value of principal outstanding, and q plays the role of total asset value. The second difference is that cash flows are endogenous, because they depend on adjustment costs and investment decisions. I model an economy with a continuum of firms hit by aggregate and idiosyncratic shocks. Even though default is a discrete event at the firm level, the aggregate default rate is a continuous function of the state of the economy. To build economic intuition, I consider first a simple example with one-period debt, constant risk-free rates, and i.i.d. firm-level shocks. I find that, to first order (i.e., for small aggregate shocks), Tobin’s q is a linear function of the spread of corporate bonds over government bonds. The sensitivity of q to bond spreads depends on the risk-neutral default rate, just like the delta of an option in the Black–Scholes formula. 
2. One could introduce taxes and bankruptcy costs if one wanted to derive an optimal capital structure, but this is not the focus of this paper. See Hackbarth, Miao, and Morellec (2006) for such an analysis, with a focus on macroeconomic risk.
In the general case, I choose the parameters of the model to match aggregate and firm level dynamics, estimated with postwar U.S. data. Given book leverage and idiosyncratic volatility, the model produces a nonlinear mapping from bond prices to q. I then use the theoretical mapping to construct a time series for q based on the relative prices of corporate and government bonds, taking into account trends in book leverage and
idiosyncratic risk, as well as changes in real risk-free rates. This bond market’s q fits the investment equation quite well with postwar aggregate U.S. data. The $R^2$ is around 60%, cash flows become insignificant, and the implied adjustment costs are more than an order of magnitude smaller than with the usual measure of q. The fit is as good in levels as in differences. The theoretical predictions for the roles of leverage and volatility are supported by the data, as are the nonlinearities implied by the model. Using simulations, I find that the predictions of the model are robust to specification errors, as well as to taxes and bankruptcy costs. The theoretical predictions for firm-level dynamics are consistent with the empirical results of Gilchrist and Zakrajsek (2007), who show that firm-specific interest rates forecast firm-level investment.
The remainder of the paper is organized as follows. Section II presents the setup of the model. Section III uses a simple example to build economic intuition. Section IV presents the numerical solution for the general case. Section V presents the evidence for aggregate U.S. data. Section VI discusses the theoretical interpretations of the results. Section VII discusses the robustness of the results to various changes in the specification of the model. Section VIII concludes.
II. MODEL
II.A. Firm Value and Investment
Time is discrete and runs from t = 0 to ∞. The production technology has constant returns to scale and all markets are perfectly competitive. All factors of production, except physical capital, can be freely adjusted within each period. Physical capital is predetermined in period t and, to make this clear, I denote it by $k_{t-1}$. Once other inputs have been chosen optimally, the firm’s profits are therefore equal to $p_t k_{t-1}$, where $p_t$ is the exogenous profit rate in period t. Let the function $\Gamma(k_{t-1}, k_t)$ capture the total cost of adjusting the level of capital from $k_{t-1}$ to $k_t$. For convenience, I include depreciation in the function $\Gamma$, and I assume that it is homogeneous of degree one, as in Hayashi (1982).3
3. For instance, the often-used case of quadratic adjustment costs corresponds to $\Gamma(k_t, k_{t+1}) = k_{t+1} - (1 - d)k_t + 0.5\gamma_2 (k_{t+1} - k_t)^2/k_t$, where d is the depreciation rate, and $\gamma_2$ is a constant that pins down the curvature of the adjustment cost function.
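As a small numerical check of the quadratic example in footnote 3, the sketch below codes that cost function and verifies that it is homogeneous of degree one, so that dividing by installed capital gives a cost that depends only on the growth rate of capital. The function names and the depreciation rate d = 0.1 are illustrative assumptions; they do not come from the paper.

```python
def adjustment_cost(k, k_next, d=0.1, gamma2=10.0):
    """Quadratic cost of moving capital from k to k_next, depreciation included."""
    return k_next - (1.0 - d) * k + 0.5 * gamma2 * (k_next - k) ** 2 / k


def scaled_cost(x, d=0.1, gamma2=10.0):
    """Cost per unit of installed capital as a function of gross growth x = k_next / k."""
    return adjustment_cost(1.0, x, d, gamma2)


k, k_next, scale = 2.0, 2.2, 3.0
total = adjustment_cost(k, k_next)

# Homogeneity of degree one: scaling both capital stocks scales the cost.
assert abs(adjustment_cost(scale * k, scale * k_next) - scale * total) < 1e-12
# Dividing by k gives the renormalized cost evaluated at x = k_next / k.
assert abs(total / k - scaled_cost(k_next / k)) < 1e-12
# Keeping capital constant costs exactly the depreciation d * k.
assert abs(adjustment_cost(k, k) - 0.1 * k) < 1e-12
```

The second assertion is the property that allows the problem to be written per unit of capital.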
Let $r_t$ be the one-period real interest rate, and let $E^\pi[\cdot]$ denote expectations under the risk-neutral probability measure $\pi$.4 The state of the firm at time t is characterized by the endogenous state variable $k_t$ and a vector of exogenous state variables $\omega_t$, which follows a Markov process under $\pi$. The profit rate and the risk-free rate are functions of $\omega_t$. The value of the firm solves the Bellman equation,
$$V(k_{t-1}, \omega_t) = \max_{k_t \ge 0} \left\{ p(\omega_t) k_{t-1} - \Gamma(k_{t-1}, k_t) + \frac{E^\pi[V(k_t, \omega_{t+1}) \mid \omega_t]}{1 + r(\omega_t)} \right\}. \tag{1}$$
Because the technology exhibits constant returns to scale, it is convenient to work with the scaled value function,
$$v_t \equiv \frac{V_t}{k_{t-1}}. \tag{2}$$
Similarly, define the growth of k as $x_t \equiv k_t/k_{t-1}$. After dividing both sides of equation (1) by $k_{t-1}$, and using the shortcut notation $\omega'$ for $\omega_{t+1}$, we obtain
$$v(\omega) = \max_{x \ge 0} \left\{ p(\omega) - \gamma(x) + x\, \frac{E^\pi[v(\omega') \mid \omega]}{1 + r(\omega)} \right\}, \tag{3}$$
where $\gamma$ is the renormalized version of $\Gamma$. The function $\gamma$ is assumed to be convex and to satisfy $\lim_{x \to 0} \gamma(x) = \infty$ and $\lim_{x \to \infty} \gamma(x) = \infty$. The optimal investment rate $x(\omega)$ solves
$$\frac{\partial \gamma}{\partial x}(x(\omega)) = q(\omega) \equiv \frac{E^\pi[v(\omega') \mid \omega]}{1 + r(\omega)}. \tag{4}$$
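To make equations (3) and (4) concrete, here is a minimal sketch that solves the scaled Bellman equation by value iteration on a two-state Markov chain. It assumes the quadratic cost from footnote 3 renormalized per unit of capital, a constant real rate, and illustrative numbers for the profit rates, the transition matrix, and depreciation; none of these is the paper's calibration, and the function name is hypothetical.

```python
def solve_scaled_bellman(p, P, r, d=0.1, gamma2=10.0, tol=1e-10, max_iter=50_000):
    """Iterate on equation (3) for a finite Markov chain.

    p : profit rate in each state; P : transition matrix under the pricing measure;
    r : constant real rate. Cost is gamma(x) = x - (1 - d) + 0.5 * gamma2 * (x - 1)**2.
    """
    n = len(p)

    def policy(v):
        # q is the discounted expected scaled value, the right-hand side of (4).
        q = [sum(P[i][j] * v[j] for j in range(n)) / (1.0 + r) for i in range(n)]
        # First-order condition gamma'(x) = 1 + gamma2 * (x - 1) = q, with x >= 0.
        x = [max(0.0, 1.0 + (qi - 1.0) / gamma2) for qi in q]
        return q, x

    v = [0.0] * n
    for _ in range(max_iter):
        q, x = policy(v)
        v_new = [
            p[i] - (x[i] - (1.0 - d) + 0.5 * gamma2 * (x[i] - 1.0) ** 2) + x[i] * q[i]
            for i in range(n)
        ]
        gap = max(abs(a - b) for a, b in zip(v_new, v))
        v = v_new
        if gap < tol:
            break
    q, x = policy(v)
    return v, q, x


# Illustrative two-state chain: low and high profit rates around r + d = 0.15.
p = [0.140, 0.155]
P = [[0.8, 0.2], [0.2, 0.8]]
v, q, x = solve_scaled_bellman(p, P, r=0.05)
for state, (qi, xi) in enumerate(zip(q, x)):
    print(f"state {state}: q = {qi:.4f}, growth x = {xi:.4f}")
```

The iteration converges only if capital grows more slowly than the discount rate, which is why the example keeps profit rates close to r + d. With larger profit rates the scaled value can be unbounded.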
Equation (4) defines the q-theory of investment: it says that the marginal cost of investment is equal to the expected discounted marginal product of capital. The most important practical issue is the construction of the right-hand side of equation (4).
4. This is equivalent to using a pricing kernel, but it simplifies the notation and the algebra. If $m'$ is the pricing kernel between states $\omega$ and $\omega'$, then for any random variable $z'$, $E[m' z' \mid \omega] = E^\pi[z' \mid \omega]/(1 + r(\omega))$. It is crucial to account for risk premia in any case. Berndt et al. (2005) show that objective probabilities of default are much smaller than risk-adjusted probabilities of default. Lettau and Ludvigson (2002) also emphasize the role of time-varying risk premia.
II.B. Measuring q
The value of the firm is the value of its debt plus the value of its equity. Let $B_t$ be the market value of the bonds outstanding at the end of period t, and define $b_t$ as the value scaled by end-of-period physical assets:
$$b_t \equiv \frac{B_t}{k_t}. \tag{5}$$
Similarly, let $e(\omega)$ be the ex-dividend value of equity, scaled by end-of-period assets. Then q is simply
$$q(\omega) = e(\omega) + b(\omega). \tag{6}$$
The most natural way to test the q-theory of investment is therefore to use equation (6) to construct the right-hand side of equation (4). Unfortunately, it fits poorly in practice (Summers 1981; Hassett and Hubbard 1997; Caballero 1999). Equation (6) has been estimated using aggregate and firm-level data, in levels or in first differences, with or without debt on the right-hand side. It leaves large unexplained residuals correlated with cash flows, and it implies implausible values for the adjustment cost function γ(x). As argued in the Introduction, there are potential explanations for this empirical failure, but none is really satisfactory. Moreover, a common finding of the recent research is that “measurement errors” in equity seem to be responsible for the failure of q-theory (Gilchrist and Himmelberg 1995; Erickson and Whited 2000, 2006; Cummins, Hassett, and Oliner 2006). I do not attempt in this paper to explain the meaning of these “measurement errors.” I simply argue that, even if equity prices do not provide a good measure of q, it is still possible to construct another one using observed bond prices.
II.C. Corporate Debt
I assume that there are no taxes and no deadweight losses from financial distress. The Modigliani–Miller theorem implies that leverage policy does not affect firm value or investment. Leverage does affect bond prices, however, and I must specify debt dynamics before I can use bond prices to estimate q. The model used here belongs to the class of structural models of debt with endogenous default boundary. In this class of models, default is chosen endogenously to maximize equity value (see Leland [2004] for an illuminating discussion).
There are many different types of long-term liabilities, and my goal here is not to study all of them, but rather to focus on
a tractable model of long-term debt. To do so, I use a version of the exponential model introduced in Leland (1994), and used by Leland (1998) and Hackbarth, Miao, and Morellec (2006), among others. In this model, the firm continuously issues and retires bonds. Specifically, a fraction φ of the remaining principal is called at par every period. The retired bonds are replaced by new ones. To understand the timing of cash flows, consider a bond with coupon c and principal normalized to 1, issued at the end of period t. The promised cash flows for this particular bond are as follows:

Period:      t+1      t+2               ...   τ                          ...
Cash flow:   c + φ    (1 − φ)(c + φ)    ...   (1 − φ)^(τ−t−1) (c + φ)    ...
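The promised cash flows in the schedule above can be tabulated directly. The sketch below builds them for a unit principal, discounts them at a constant rate (an extra assumption made only for this illustration), and checks two facts that follow from the geometric structure: the default-free present value equals (c + φ)/(φ + r), and principal is repaid after 1/φ periods on average. The helper names are hypothetical.

```python
def promised_cash_flows(c, phi, periods):
    """Cash flows paid 1, 2, ..., periods after issuance by a bond with unit principal."""
    return [(1.0 - phi) ** (j - 1) * (c + phi) for j in range(1, periods + 1)]


def present_value(flows, r):
    """Discount a list of cash flows at a constant one-period rate r."""
    return sum(cf / (1.0 + r) ** j for j, cf in enumerate(flows, start=1))


c, phi, r = 0.043, 0.1, 0.03           # illustrative coupon, retirement rate, real rate
flows = promised_cash_flows(c, phi, periods=2_000)
pv = present_value(flows, r)
closed_form = (c + phi) / (phi + r)     # geometric sum of the discounted flows

# Average time at which principal is repaid: sum of j * phi * (1 - phi)**(j - 1).
average_maturity = sum(j * phi * (1.0 - phi) ** (j - 1) for j in range(1, 2_001))

print(f"present value = {pv:.6f}, closed form = {closed_form:.6f}")
print(f"average maturity = {average_maturity:.4f} periods")
```

Truncating at 2,000 periods is harmless here because the flows decay geometrically.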
Let $\Psi_{\tau-1}$ be the sum of the face values of all the bonds outstanding at the beginning of period τ. I use the index τ − 1 to make clear that this variable, just like physical capital, is predetermined at the beginning of each period. The timing of events in each period is the following:
1. The firm enters period τ with capital $k_{\tau-1}$ and total face value of outstanding bonds $\Psi_{\tau-1}$.
2. The state variable $\omega_\tau$ is realized. The value of the firm is then $V_\tau = v_\tau k_{\tau-1}$, defined in equations (1) and (3).
   (a) If equity value falls to zero, the firm defaults and the bond holders recover $V_\tau$.
   (b) Otherwise, the bond holders receive cash flows $(c + \phi)\Psi_{\tau-1}$.
3. At the end of period τ, the capital stock is $k_\tau$, the face value of the bonds (including newly issued ones) is $\Psi_\tau$, and their market value is $B_\tau = b_\tau k_\tau$. New issuances represent a principal of $\Psi_\tau - (1 - \phi)\Psi_{\tau-1}$.
In Leland (1994) and Leland (1998), book assets are constant, because there is no physical investment, and the firm simply chooses a constant face value $\Psi$. In my setup, the corresponding assumption is that the firm chooses a constant book leverage ratio. In the theoretical analysis, I therefore maintain the following assumption:
ASSUMPTION. Firms keep a constant book leverage ratio: $\psi \equiv \Psi_t/k_t$.
A bond issued at the end of period t has a remaining face value of $(1 - \phi)^{\tau-t-1}$ at the beginning of period τ. In case of default
during period τ, all bonds are treated similarly and the bond issued at time t receives $(1 - \phi)^{\tau-t-1} V_\tau/\Psi_{\tau-1}$. Because all outstanding bonds are treated similarly in case of default, we can characterize the price without specifying when this principal was issued. The following proposition characterizes the debt pricing function.
PROPOSITION 1. The scaled value of corporate debt solves the equation
$$b(\omega) = \frac{1}{1 + r(\omega)}\, E^\pi\left[\min\{(c + \phi)\psi + (1 - \phi)\, b(\omega');\ v(\omega')\} \mid \omega\right]. \tag{7}$$
Proof. See Appendix.
The intuition behind equation (7) is relatively simple. Default happens when equity value falls to zero, that is, when v − (c + φ)ψ − (1 − φ)b = 0. There are no deadweight losses and bondholders simply recover the value of the company. When there is no default, bondholders receive the cash flows (c + φ)ψ and they own (1 − φ) remaining bonds. A few special cases are worth pointing out. Short-term debt corresponds to φ = 1 and c = 0, and the pricing function is simply
$$b^{\text{short}}(\omega) = \frac{1}{1 + r(\omega)}\, E^\pi\left[\min(\psi;\ v(\omega')) \mid \omega\right]. \tag{8}$$
The main difference between short- and long-term debt is the presence of the pricing function b on both sides of equation (7), whereas it appears only on the left-hand side in equation (8). A perpetuity corresponds to φ = 0, and, more generally, 1/φ is the average maturity of the debt. The value of a default-free bond with the same coupon and maturity structure would be
$$b^{\text{free}}(\omega) = \frac{(c + \phi)\psi + (1 - \phi)\, E^\pi[b^{\text{free}}(\omega') \mid \omega]}{1 + r(\omega)}. \tag{9}$$
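Because b appears on both sides of equation (7), one natural way to compute it is fixed-point iteration: the mapping is a contraction with modulus (1 − φ)/(1 + r). The sketch below does this on a three-state Markov chain, taking the scaled firm values as given. The state values, the transition matrix, and the function name are hypothetical, chosen so that default occurs in the lowest state. The default-free benchmark uses the constant-rate solution of equation (9), which is (c + φ)ψ/(φ + r).

```python
def price_debt(v, P, r, c, phi, psi, tol=1e-12, max_iter=10_000):
    """Solve equation (7) by fixed-point iteration on a finite Markov chain.

    v : scaled firm value in each state (taken as given here);
    P : transition matrix under the pricing measure; r : constant real rate.
    """
    n = len(v)
    b = [0.0] * n
    for _ in range(max_iter):
        # Payoff next period: promised cash flow plus remaining bonds, capped by firm value.
        payoff = [min((c + phi) * psi + (1.0 - phi) * b[j], v[j]) for j in range(n)]
        b_new = [sum(P[i][j] * payoff[j] for j in range(n)) / (1.0 + r) for i in range(n)]
        gap = max(abs(new - old) for new, old in zip(b_new, b))
        b = b_new
        if gap < tol:
            break
    return b


r, c, phi, psi = 0.03, 0.043, 0.1, 0.45
v = [0.30, 0.90, 1.30]                        # hypothetical scaled firm values
P = [[0.50, 0.40, 0.10], [0.10, 0.80, 0.10], [0.05, 0.25, 0.70]]
b = price_debt(v, P, r, c, phi, psi)
b_free = (c + phi) * psi / (phi + r)          # default-free value at a constant rate

for state, bi in enumerate(b):
    print(f"state {state}: b = {bi:.4f}, relative price b / b_free = {bi / b_free:.4f}")
```

Every state can reach the default state with positive probability, so the risky bond trades below the default-free value everywhere, and by more in the state closest to default.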
With a constant risk-free rate, $b^{\text{free}}$ is simply equal to (c + φ)ψ/(φ + r).
III. SIMPLE EXAMPLE
This section presents a simple example in order to build intuition for the more general case. The specific assumptions made in this section, and relaxed later, are that the risk-free rate is constant; firms issue only short-term debt; and idiosyncratic shocks
are i.i.d. Let us first decompose the state ω into its aggregate component s, and its idiosyncratic component η. The aggregate state follows a discrete Markov chain over the set [1, 2, . . . , S], and it pins down the aggregate profit rate a(s), as well as the conditional risk-neutral expectations. The profit rate of the firm depends on the aggregate state and on the idiosyncratic shock:
$$p(s, \eta) = a(s) + \eta. \tag{10}$$
The shocks η are independent over time, and distributed according to the density function ζ(.). Because idiosyncratic profitability shocks are i.i.d., the value function is additive and can be written v(s, η) = v(s) + η. I assume that s and η are such that v(s, η) is always positive, so that firms never exit. Tobin’s q is the same for all firms, and I normalize the mean of η to zero; therefore,
$$q(s) = \frac{E^\pi[v(s') \mid s]}{1 + r}. \tag{11}$$
Let $\bar v \equiv E^\pi[v(s)]$ be the unconditional risk-neutral average asset value, and define $\bar q \equiv \bar v/(1 + r)$. All the firms choose the same investment rate in this simple example. This will not be true in the general model with persistent idiosyncratic shocks. We can write the value of the aggregate portfolio of corporate bonds by integrating (8) over idiosyncratic shocks:
$$b(s) = \frac{1}{1 + r}\, E^\pi\left[\psi + \int_{-\infty}^{\psi - v(s')} (v(s') + \eta' - \psi)\, \zeta(\eta')\, d\eta' \,\Big|\, s\right]. \tag{12}$$
In equation (12), ψ is the promised payment, and the integral measures credit losses. Let δ be the default rate estimated at the risk-neutral average value
$$\delta \equiv \int_{-\infty}^{\psi - \bar v} \zeta(\eta')\, d\eta'. \tag{13}$$
Let $\bar b \equiv \left(\psi + \int_{-\infty}^{\psi - \bar v} (\bar v + \eta' - \psi)\, \zeta(\eta')\, d\eta'\right)/(1 + r)$ be the corresponding price for the aggregate bond portfolio. Using (13) and (11), we can write (12) as
$$b(s) - \bar b = \delta\,(q(s) - \bar q) + \frac{E^\pi[o(v')]}{1 + r}, \tag{14}$$
where $o(v') \equiv \int_{\psi - \bar v}^{\psi - v'} (v' + \eta' - \psi)\, \zeta(\eta')\, d\eta'$ is first-order small, in the sense that $o(\bar v) = 0$ and $\partial o/\partial v' = 0$ when evaluated at $\bar v$. When
aggregate shocks are small, so that $v'$ stays relatively close to $\bar v$, $E^\pi[o(v')]$ is negligible. Equation (14) is the equivalent of the Black–Scholes–Merton formula, applied to Tobin’s q. The value of the option (debt) depends on the value of the underlying (q), and the delta of the option is the probability of default. If this probability is exactly zero, bond prices do not contain information about q. The fact that the sensitivity of b to q is given by δ is intuitive. Indeed, b responds to q precisely because a fraction δ of firms default on average each period. A one-unit move in aggregate q therefore translates into a δ move in the price of a diversified portfolio of bonds.
To make equation (14) empirically relevant, we need to express it in terms of bond yields. All the prices we have discussed so far are in real terms, but, in practice, we observe nominal yields. Let $r^{\$}$ be the nominal risk-free rate, and let $y^{\$}$ be the nominal yield on corporate bonds. With short-term debt, the market value is equal to the nominal face value divided by $1 + y^{\$}$. Under the assumption we have made in this section, and neglecting the terms that are first-order small, a simple manipulation of equation (14) leads to the following proposition.
PROPOSITION 2. To a first-order approximation, Tobin’s q is a linear function of the relative yields of corporate and government bonds,
$$q_t \approx \frac{\psi}{\delta(1 + r)}\, \frac{1 + r_t^{\$}}{1 + y_t^{\$}} + \text{constant}, \tag{15}$$
where r is the real risk-free rate, ψ is average book leverage, and δ is the risk-neutral default rate.
The proposition sheds light on existing empirical studies, such as Bernanke (1983), Stock and Watson (1989), and Lettau and Ludvigson (2002), showing that the spread of corporate bonds over government bonds predicts future output.5 This finding is consistent with q-theory, because the proposition shows that corporate bond spreads are, to first order, proportional to Tobin’s q.
5. In the proposition, I use the relative bond price (the ratio) instead of the spread (the difference) because this is more accurate when inflation is high. The approximation of small aggregate shocks made in this section refers to real shocks, but does not require average inflation to be small.
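The logic of equations (12) to (15) can be checked numerically in a stripped-down setting. The sketch below assumes normally distributed idiosyncratic shocks and a next-period asset value that is known in advance, so the conditional expectation drops out. It prices the bond portfolio exactly, computes the default rate δ at the average value, and then recovers q from the relative bond price using the linear relation of Proposition 2. All numbers are illustrative, and the volatility is set well above the paper's 14% so that δ is not negligible.

```python
from math import erf, exp, pi, sqrt


def norm_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def norm_pdf(z):
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def bond_price(v_next, psi, sigma, r):
    """Equation (12) with normal idiosyncratic shocks and a known next-period value."""
    z = (psi - v_next) / sigma
    # psi minus expected credit losses, using the closed form of the truncated integral.
    expected_payoff = psi + (v_next - psi) * norm_cdf(z) - sigma * norm_pdf(z)
    return expected_payoff / (1.0 + r)


psi, sigma, r = 0.45, 0.30, 0.03       # sigma is deliberately large so that delta is visible
v_bar = 1.03                           # hypothetical average asset value, so q_bar = 1
q_bar = v_bar / (1.0 + r)
b_bar = bond_price(v_bar, psi, sigma, r)
delta = norm_cdf((psi - v_bar) / sigma)    # equation (13)


def q_from_relative_price(relative_price):
    """Invert the first-order relation (15); relative_price is (1 + r$) / (1 + y$)."""
    constant = q_bar - b_bar / delta
    return psi / (delta * (1.0 + r)) * relative_price + constant


for v_next in (1.02, 1.03, 1.04):
    b = bond_price(v_next, psi, sigma, r)
    relative_price = b * (1.0 + r) / psi   # short-term debt: b = psi / (1 + real yield)
    q_true = v_next / (1.0 + r)
    print(f"q = {q_true:.4f}, recovered q = {q_from_relative_price(relative_price):.4f}")
```

The recovered q differs from the true one only by the second-order term o(v′), scaled by 1/δ, which is why the approximation deteriorates quickly when the default rate is very small or the aggregate shock is large.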
IV. LONG-TERM DEBT AND PERSISTENT IDIOSYNCRATIC SHOCKS
I now consider the case of long-term debt and persistent firm-level shocks. The goal is to obtain a mapping from bond yields to Tobin’s q that extends the simple case presented above. As in the previous section, let s denote the aggregate state and let η denote the idiosyncratic component of the profit rate, defined in equation (10). With persistent idiosyncratic shocks, Tobin’s q and the investment rate depend on both s and η, and the value function is no longer additively separable. There is no closed-form solution for bond prices, and the approximation of Proposition 2 is cumbersome because of the fixed point problem in equation (7). I therefore turn directly to numerical simulations. I maintain for now the assumptions of a constant risk-free rate r and of constant book leverage ψ. I use a quadratic adjustment cost function:
$$\gamma(x) = \gamma_1 x + 0.5\gamma_2 x^2. \tag{16}$$
With this functional form, the investment equation is simply $x = (q - \gamma_1)/\gamma_2$. Idiosyncratic profitability is assumed to follow an AR(1) process:
$$\eta_t = \rho_\eta \eta_{t-1} + \sigma_\eta \varepsilon_t^\eta. \tag{17}$$
Similarly, I specify aggregate dynamics as
$$a_t - \bar a = \rho_a (a_{t-1} - \bar a) + \sigma_a \varepsilon_t^a. \tag{18}$$
The shocks $\{\varepsilon_t^\eta\}_{\eta \in [0,1]}$ and $\varepsilon_t^a$ follow independent normal distributions with zero mean and unit variance. The results discussed below are based on the following parameters:

Parameter:   r     ψ      φ     γ1    γ2    ρη     ση     ρa    σa      ā/r     c
Value:       3%    0.45   0.1   1     10    0.47   14%    0.7   4.5%    0.925   4.3%
Book leverage is set to 0.45 and average debt maturity to ten years (φ = 0.1), based on Leland (2004), who uses these values as benchmarks for Baa bonds. The parameter γ1 is irrelevant and is normalized to one in this section. There is much disagreement about the parameter γ2 in the literature. Shapiro (1986) estimates a value of around 2.2 years, and Hall (2004) finds even smaller adjustment costs.6 On the other hand, Gilchrist and Himmelberg (1995) find values of around twenty years, and estimates from macro data are often implausibly high (Summers 1981). I pick a value of γ2 = 10 years, which is in the middle of the set of existing estimates. It turns out, however, that the mapping from bond yields to q is not very sensitive to this parameter. The parameters of equations (17) and (18) are calibrated using U.S. firm and aggregate data, as explained in Section V. Finally, the coupon rate c is chosen so that bonds are issued at par value, as in Leland (1998).
6. Shapiro (1986) estimates between 8 and 9 using quarterly data, which corresponds to 2 to 2.2 at annual frequencies.
We can now use the model to understand the relationship between bond prices and Tobin’s q. The main idea of the paper is to use the price of corporate bonds relative to Treasuries to construct a measure of q. The model is simulated with the parameters just described. The processes (17) and (18) are approximated with discrete-state Markov chains using the method in Tauchen (1986). The investment rate x(s, η) and the value of the firm v(s, η) are obtained by solving the dynamic programming problem in equation (3). Equation (7) is then used to compute the bond pricing function b(s, η). The aggregate bond price b(s) and the aggregate corporate yield y(s) are obtained by integrating over the ergodic distribution of η.
Figure I presents the main result. It shows the model-implied aggregate q(s) as a function of the model-implied average relative bond price (φ + r)/(φ + y(s)). Figure I is generated by considering all the possible values of the aggregate state variable s. Tobin’s q is an increasing and convex function of the relative price of corporate bonds. Figure I therefore extends Proposition 2 to the case of long-term debt, persistent firm-level shocks, and large aggregate shocks. The mapping from bond yields to Tobin’s q is conditional on the calibrated parameters, in particular on book leverage and idiosyncratic volatility. Figure II shows the comparative statics with respect to book leverage (ψ) and firm volatility (ση).
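The discretization step mentioned above can be sketched as follows. This is a plain implementation of the Tauchen (1986) method applied to the idiosyncratic process (17) with ρη = 0.47 and ση = 14%, followed by a simple power iteration for the ergodic distribution used to aggregate across firms. The number of grid points, the grid width of three unconditional standard deviations, and the function names are assumptions made for illustration; the paper does not report these choices in this section.

```python
from math import erf, sqrt


def norm_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def tauchen(rho, sigma, n=9, width=3.0):
    """Discretize y_t = rho * y_{t-1} + sigma * eps_t, eps ~ N(0, 1), on n grid points."""
    std_y = sigma / sqrt(1.0 - rho ** 2)           # unconditional standard deviation
    step = 2.0 * width * std_y / (n - 1)
    grid = [-width * std_y + i * step for i in range(n)]
    P = []
    for y in grid:
        mean = rho * y
        row = []
        for j, y_next in enumerate(grid):
            upper = norm_cdf((y_next + 0.5 * step - mean) / sigma)
            lower = norm_cdf((y_next - 0.5 * step - mean) / sigma)
            if j == 0:
                row.append(upper)                   # everything below the first midpoint
            elif j == n - 1:
                row.append(1.0 - lower)             # everything above the last midpoint
            else:
                row.append(upper - lower)
        P.append(row)
    return grid, P


def ergodic_distribution(P, iterations=2_000):
    """Stationary distribution by repeatedly applying the transition matrix."""
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return dist


grid, P = tauchen(rho=0.47, sigma=0.14)            # idiosyncratic process, equation (17)
weights = ergodic_distribution(P)
mean = sum(w * y for w, y in zip(weights, grid))
std = sqrt(sum(w * (y - mean) ** 2 for w, y in zip(weights, grid)))
print(f"ergodic mean = {mean:.4f}, ergodic std = {std:.4f}")
```

The same routine applies to the aggregate process (18) after subtracting the mean. The weights returned by `ergodic_distribution` are what one would use to average firm-level bond prices b(s, η) into the aggregate price b(s).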
The comparative statics are intuitive. For a given value of q, an increase in leverage leads to more credit risk and lower bond prices, so the mapping shifts left when leverage increases. Similarly, for a given value of q, an increase in idiosyncratic volatility increases credit risk, and the mapping shifts left when volatility increases. In this case, the slope and the curvature of the mapping also change, and the intuition is given by Proposition 2: idiosyncratic volatility increases the delta of the bond with respect to q. In the next section, mappings like the ones displayed in Figure II are used to construct a new measure of q from observed
bond yields, leverage, and volatility.

FIGURE I. Aggregate Tobin’s q and the Relative Price of Corporate Bonds. The figure shows the implicit mapping between average bond prices and q across aggregate states (with different aggregate profit rates). The price of corporate bonds relative to risk-free bonds is defined as (0.1 + r)/(0.1 + y), where r is the risk-free rate and y is the average yield on corporate bonds. The factor 0.1 reflects the average maturity of 10 years. The mapping is for benchmark values of book leverage, idiosyncratic volatility, and a constant risk-free rate of 3% (see Table II). Axes: Tobin’s q (vertical) against the relative price of corporate bonds (horizontal).

With respect to leverage, it is important to emphasize the role played by the maintained assumptions of no taxes and no bankruptcy costs. These assumptions imply that capital structure is irrelevant for real decisions (i.e., investment) and for firm value (Modigliani and Miller 1958). Leverage is relevant for bond pricing, however. Bond prices depend on leverage in the same way that they do in the model of Merton (1974): higher leverage increases default risk and therefore decreases the relative price of corporate bonds. Thus, it is crucial to use a mapping that is conditional on leverage to recover the correct value of q. To see why, imagine a world where firms choose their leverage to stabilize their credit spreads. In this case, the correlation between spreads and investment could be arbitrarily small. This would not invalidate the construction of q, however, because the explanatory power would then come from observed changes in leverage. In terms of Figure II, firms would
maintain a constant relative price, but their leverage would jump from one mapping to another. I return to this issue in Section VII.

FIGURE II. Impact of Leverage and Firm Volatility. Calibration as in Figure I, except for book leverage in the top panel and firm volatility in the bottom panel. (a) Mapping for different values of book leverage (0.4, 0.5, 0.6); (b) mapping for different volatilities of idiosyncratic shocks (firm σ of 0.1, 0.15, 0.2). Axes in both panels: Tobin’s q (vertical) against the relative price of corporate bonds (horizontal).

V. EMPIRICAL EVIDENCE

In this section, I construct a new measure of q using only data from the bond market. I then compare this measure to the usual measure of q, and I assess their respective performances in the aggregate investment equation. The data used in the calibration are summarized in Table I. All the parameters used in the calibration, and the empirical moments used to infer them, are presented in Table II.

TABLE I
SUMMARY STATISTICS: QUARTERLY AGGREGATE DATA, 1953:2–2007:2

Variable                       Obs.    Mean     St. dev.   Min       Max
I/K                            217     0.105    0.010      0.082     0.125
E(inflation)                   217     0.037    0.025      −0.016    0.113
yBaa                           217     0.082    0.030      0.035     0.170
r10                            217     0.065    0.027      0.023     0.148
(0.1 + r10)/(0.1 + yBaa)       217     0.908    0.033      0.796     0.974
Classic Tobin’s q              217     2.029    0.845      0.821     4.989
Bond market’s q                217     1.500    0.117      1.154     1.720

Notes. Investment and replacement cost of capital are from NIPA. Expected inflation is from the Livingston survey. Yields on 10-year Treasuries and Moody’s Baa index are from FRED. Classic Tobin’s q is computed from the flow of funds, following Hall (2001). Bond market’s q is computed using the structural model, and its mean is normalized to 1.5.

V.A. Data and Estimation of the Parameters

I now describe the data used to estimate the parameters of equations (17) and (18) and the construction of q.

Leverage. In the baseline case, book leverage is set to 0.45 based on Leland (2004). Using Compustat, I find a slow increase in average book leverage from 0.4 to 0.55 over the postwar period (Figure IIIa). The sample includes nonfinancial firms with at least five years of nonmissing values for assets, stock price, operating income, debt, capital expenditures, and property, plant, and equipment.

Idiosyncratic Risk. Equation (17) is estimated with firm-level data from Compustat. The profit rate is operating income divided by the net stock of property, plant, and equipment, and η is the idiosyncratic component of this profit rate. Firms in finance and
real estate are excluded. The panel regression includes firm fixed effects to remove permanent differences in average profitability across firms or industries due to accounting and technological differences. The estimated baseline parameters, ρη = 0.47 and ση = 14%, are consistent with many previous studies.7

7. For instance, Gomes (2001) uses a volatility of 15% and a persistence of 0.62 for the technology shocks. Hennessy, Levy, and Whited (2007) report a persistence of the profit rate of 0.51 and a volatility of 11.85%, which they match with a persistence of 0.684 and a volatility of 11.8% for the technology shocks. Note that in both of these papers, firms operate a technology with decreasing returns. Here, by contrast, the technology has constant returns to scale. This explains why some details of the calibration are different.

TABLE II
PARAMETERS OF BENCHMARK MODEL

                                                                          Data     Model
Parameters chosen exogenously
  Real risk-free rate r                                                            3%
  Curvature of adjustment cost function γ2                                         10 years
  Average maturity 1/φ                                                             10 years
  Book leverage                                                                    0.45
Parameters directly observed in the data
  Persistence of idiosyncratic profit rate ρη                             0.47     0.47
  Volatility of idiosyncratic innovations ση                              0.14     0.14
  Persistence of aggregate profit rate ρa                                 0.7      0.7
Moments matched
  Relative bond price (mean), (0.1 + r)/(0.1 + y)                         0.908    0.908
  Relative bond price (volatility of detrended series), (0.1 + r)/(0.1 + y)   0.027    0.027
  Average bond issued at par value, E[b]/f                                1        1
Implied parameters
  Average profit rate ā/r                                                          0.925
  Volatility of aggregate innovations σa                                           0.045
  Coupon rate c                                                                    0.043

An important issue is that the idiosyncratic volatility of publicly traded companies is not constant. Campbell and Taksler (2003) show that changes in idiosyncratic risk have contributed to changes in yield spreads. The frequency of accounting data is too low to estimate quarterly changes in volatility. In addition, we need a forward-looking measure of idiosyncratic risk to capture market expectations. For all these reasons, the best measure should be based on idiosyncratic stock returns. Following the standard practice in the literature, I use a six-month moving average
of the monthly cross-sectional standard deviation of individual stock returns. I scale this new measure to have a sample mean of 14% to obtain σ̂_t^η, a time-varying estimate of idiosyncratic risk. As a robustness check, I also consider the cross-sectional standard deviation of the growth rate of sales, measured from Compustat, as a measure of volatility that avoids using stock returns.8 The two measures of σ̂_t^η are presented in Figure IIIb.

8. The dispersion of sales growth is not a perfect measure either, because permanent differences in growth rates would make dispersion positive even if there is no risk. There are other ways to define idiosyncratic risk at the firm level, but they produce similar trends. See Comin and Philippon (2005) for a comparison of various measures of firm volatility. See also Campbell et al. (2001) and Davis et al. (2006) for evidence on privately held companies.

FIGURE III. The Components of Bond Market q. Leverage is average book leverage among nonfinancial firms in Compustat. Idiosyncratic volatility is estimated either from idiosyncratic stock returns or from the dispersion of sales growth. Both measures are then translated into the parameter ση of the model. Relative bond price is the relative price of corporate and government bonds, defined as (0.1 + r)/(0.1 + y), using Moody’s Baa and 10-year Treasury yields. (a) Bond prices and leverage, showing the relative bond price and book leverage (right axis); (b) two measures of idiosyncratic risk, showing volatility (returns) and volatility (sales). Horizontal axes: 1955 to 2005.

Aggregate Bond Prices. Moody’s Baa index, denoted y_t^Baa, is the main measure of the yield on risky corporate debt. Moody’s index is the equal-weighted average of yields on Baa-rated bonds issued by large nonfinancial corporations.9 Following the literature, the 10-year Treasury yield is used as the benchmark risk-free rate. Both r_t^10 and y_t^Baa are obtained from FRED.10 For equation (18), using annual NIPA data on corporate profits and the stock of nonresidential capital over the postwar period, I estimate ρa = 0.7. The parameters ā and σa cannot be calibrated with historical aggregate profit rates because they must capture risk-adjusted values, not historical ones.11 Instead, the model must be consistent with observed bond prices. Three parameters are thus not directly observed in the data: these are c (the coupon rate), ā, and σa. Their values are inferred by matching empirical and simulated moments. The empirical moments are the mean and standard deviation of the price of Baa bonds relative to Treasuries, defined as (φ + r_t^$)/(φ + y_t^$), where y^$ is the yield on Baa corporate bonds and r^$ is the yield on government bonds. The final requirement is that the average bond be issued at par. The three parameters c, ā, and σa are chosen simultaneously to match the par-value requirement and the two empirical moments. The parameters inferred from the simulated moments are c = 4.3%, ā/r = 0.925, and σa = 4.5%.

9. To be included in the index, a bond must have a face value of at least 100 million, an initial maturity of at least 20 years, and most importantly, a liquid secondary market. Beyond these characteristics, Moody’s has some discretion on the selection of the bonds. The number of bonds included in the index varies from 75 to 100 in any given year. The main advantages of Moody’s measure are that it is available since 1919, and that it is broadly representative of the U.S. nonfinancial sector, because Baa is close to the median among rated companies.

10. Federal Reserve Economic Data: http://research.stlouisfed.org/fred2/. The issue with using the ten-year Treasury bond is that it incorporates a liquidity premium relative to corporate bonds. To adjust for this, it is customary to use the LIBOR/swap rate instead of the Treasury rate as a measure of the risk-free rate (see Duffie and Singleton [2003] and Lando [2004]), but these rates are only available for relatively recent years. I add 30 basis points to the risk-free rate to adjust for liquidity (see Almeida and Philippon [2007] for a discussion of this issue).

11. Note that, in theory, the same applies to ρa, because persistence under the risk-neutral measure can be different from persistence under the physical measure. In practice, however, the difference for ρa is much smaller than for ā or σa. I therefore take the historical persistence to be a good approximation of the risk-neutral persistence. Section VII shows that the model is robust to various assumptions regarding aggregate dynamics.

Expected Inflation and Real Rate. The Livingston survey is used to construct expected inflation, and the yield on the ten-year Treasury to construct the ex ante real interest rate, r̂_t^real.

Creating qbond. The model described in Section IV constructs q from the relative price of corporate bonds, conditional on the baseline values for the risk-free rate, book leverage, and idiosyncratic risk. As I have just explained, the risk-free rate, book leverage, and idiosyncratic volatility move over time. Therefore, qbond is a function of four observed inputs: average book leverage ψ̂_t, average idiosyncratic volatility σ̂_t^η, the ex ante real rate r̂_t^real, and the relative price of corporate bonds:

$$q_t^{bond} = F\left(\frac{\phi + r_t^{10}}{\phi + y_t^{Baa}};\ \hat{\sigma}_t^{\eta};\ \hat{\psi}_t;\ \hat{r}_t^{real}\right). \tag{19}$$

Figure III displays the three main components: leverage, volatility, and the relative price. In theory, the dynamics of the four inputs must be jointly specified to construct the mapping of equation (19). Quantitatively, however, it turns out that one can estimate mappings with respect to (φ + r_t^10)/(φ + y_t^Baa) assuming constant values for the other three parameters, as I did in Figure II. For the risk-free rate, this follows from a well-known fact in the bond pricing literature: risk-free rate dynamics play a negligible role in fitting corporate spreads. For σ̂_t^η and ψ̂_t, the historical series are so persistent that there is little difference between the mapping assuming a constant value and the mapping conditional on the same value in the time-varying model.12

12. To check this, I construct an extended Markov model where all the parameters follow AR(1) processes calibrated from the data. I then create mappings conditional on each realization of the parameters and I compare them to the mappings from Figure II. I find that the discrepancies are small for volatility and invisible for book leverage and the risk-free rate. Detailed results and figures are available upon request.

Classic Measure of Tobin’s q. The usual measure of Tobin’s q is constructed from the flow of funds as in Hall (2001). The usual
measure is the ratio of the value of ownership claims on the firm less the book value of inventories to the reproduction cost of plant and equipment. All the details on the construction of this measure can be found in Hall (2001).

FIGURE IV. Usual Measure of q and Bond Market’s q. Tobin’s q is constructed from the flow of funds, as in Hall (2001). Bond q is constructed from Moody’s yield on Baa bonds, using the structural model calibrated to the observed evolutions of book leverage and firm volatility, expected inflation from the Livingston survey, and the yield on 10-year Treasury bonds. Series plotted: usual q and bond q, 1955 to 2005.

Investment and Capital Stock. I use the series on private nonresidential fixed investment and the corresponding current stock of capital from the Bureau of Economic Analysis. Table I displays the summary statistics.

V.B. Investment Equations

Figure IV shows the two measures of q: qusual, constructed from the flow of funds as in Hall (2001), and qbond, constructed using bond yields, leverage, idiosyncratic volatility, expected inflation, and the theoretical mappings described in the previous sections. The average value of qbond is arbitrary, because γ1 is a free parameter, and I normalize it to 1.5. Figure IV shows that qusual is approximately seven times more volatile than qbond. The standard deviation of qusual is 0.845, whereas the standard deviation
of qbond is only 0.117, as reported in Table I. It is also interesting to note that qbond is approximately stationary, because the mappings take into account the evolution of idiosyncratic volatility and book leverage, as explained above. In the short run, qbond depends mostly on the relative price component. Year-to-year changes in (φ + r_t^10)/(φ + y_t^Baa) account for 85% of the year-to-year changes in qbond. In the long run, leverage and, especially, idiosyncratic risk are also important.

FIGURE V. Usual Measure of q and Investment Rate. I/K is corporate fixed investment over the replacement cost of equipment and structures. Usual q is constructed from the flow of funds, as in Hall (2001). Series plotted: I/K and usual q, 1955 to 2005.

Figure V shows qusual and the investment rate in structures and equipment. Figure VI shows qbond and the same investment rate. The corresponding regressions are reported in the upper panel of Table III. They are based on quarterly data. The investment rate in structures and equipment is regressed on the two measures of q, measured at the end of the previous quarter:

$$x_t = \alpha + \beta^{b} q^{bond}_{t-1} + \beta^{e} q^{usual}_{t-1} + \varepsilon_t.$$
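To make the estimation concrete, here is a minimal Python sketch of an investment regression with autocorrelation-robust standard errors. It simplifies the equation above to a constant and a single lagged regressor, and it uses Bartlett (Newey–West) weights with four lags. The function name and the data are hypothetical: the series below are made up for illustration and are not the data used in the paper.

```python
import math


def ols_newey_west(y, x, lags=4):
    """OLS of y on a constant and one regressor x.

    Returns (intercept, slope, hac_se_of_slope), where the standard error
    uses Bartlett-kernel (Newey-West) weights up to `lags` autocovariances.
    No small-sample correction is applied.
    """
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    xd = [xi - x_bar for xi in x]  # demeaned regressor
    sxx = sum(v * v for v in xd)
    slope = sum(v * (yi - y_bar) for v, yi in zip(xd, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for yi, xi in zip(y, x)]

    # Score series for the slope: demeaned regressor times residual.
    g = [v * u for v, u in zip(xd, resid)]
    long_run = sum(gt * gt for gt in g)  # lag-0 term
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)  # Bartlett weight
        gamma = sum(g[t] * g[t - lag] for t in range(lag, n))
        long_run += 2.0 * weight * gamma
    se = math.sqrt(max(long_run, 0.0)) / sxx
    return intercept, slope, se


# Synthetic illustration only: these numbers are made up, not the paper's data.
q_lag = [1.40, 1.45, 1.50, 1.55, 1.60, 1.52, 1.48, 1.43, 1.47, 1.56, 1.61, 1.58]
noise = [0.001, -0.002, 0.000, 0.002, -0.001, 0.001,
         -0.001, 0.002, 0.000, -0.002, 0.001, -0.001]
inv_rate = [0.01 + 0.065 * q + e for q, e in zip(q_lag, noise)]
a_hat, b_hat, se_b = ols_newey_west(inv_rate, q_lag, lags=4)
```

With a constant and one regressor, the sandwich formula for the slope reduces to the long-run variance of the demeaned regressor times the residual, divided by the squared sum of squares of the demeaned regressor, which is what the function computes. A regression with both measures of q would need the full matrix version.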
The standard errors control for autocorrelation in the error terms up to four quarters. qbond alone accounts for almost 60% of aggregate variations in the investment rate. qusual accounts for only
10% of aggregate variations. Moreover, once qbond is included, the standard measure has no additional explanatory power. Looking at Figure V, the fit of the investment equation is uniformly good, except in the late 1980s and early 1990s, where, even though the series remain correlated in changes (see below), there is a persistent discrepancy in levels. qbond is more correlated with the investment rate, hence the better fit of the estimated equation, but it is also less volatile than qusual. As a result, the elasticity of investment to q is almost eighteen times higher with this new measure, which is an encouraging result because the low elasticity of investment with respect to q has long been a puzzle in the academic literature. The estimated coefficient still implies adjustment costs that are too high, around 15 years, but, as Erickson and Whited (2000) point out, there are many theoretical and empirical reasons that the inverse of the estimated coefficient is likely to underestimate the true elasticity.13

13. Note that the mapping is calibrated assuming γ2 = 10, so in theory the coefficient should be 0.1. In Table III, it is 0.065. I have also solved for the model assuming γ2 = 15. This makes the theoretical and actual coefficients similar, but does not change the rest of the results. See also Section VII for a discussion of biases.

FIGURE VI. Bond Market’s q and Investment Rate. I/K is corporate fixed investment over the replacement cost of equipment and structures. Bond q is constructed from Moody’s yield on Baa bonds, using the structural model calibrated to the observed evolutions of book leverage and firm volatility, expected inflation from the Livingston survey, and the yield on 10-year Treasury bonds. Series plotted: I/K and bond q, 1955 to 2005.

TABLE III
BENCHMARK REGRESSIONS

Equation in levels: I/K(t)
                                              (1)          (2)          (3)          (4)
Bond q (t − 1)                                0.0650∗∗∗                 0.0629∗∗∗
  S.e.                                        (0.00594)                 (0.00642)
Classic q (t − 1)                                          0.00366∗∗    0.000928
  S.e.                                                     (0.00155)    (0.000970)
Bond q (t − 1), alt. measure                                                         0.0521∗∗∗
  S.e.                                                                               (0.00706)
Observations                                  216          216          216          216
OLS R2                                        .574         .095         .580         .432

Estimation in changes: I/K(t) − I/K(t − 4)
                                              (1)          (2)          (3)          (4)
[bond q] (t − 5, t − 1)                       0.0515∗∗∗                 0.0471∗∗∗
  S.e.                                        (0.00495)                 (0.00584)
[classic q] (t − 5, t − 1)                                 0.00700∗∗∗   0.00240∗
  S.e.                                                     (0.00187)    (0.00133)
[profit rate] (t − 5, t − 1)                                            0.0530
  S.e.                                                                  (0.0514)
[bond q] (t − 5, t − 1), alt. measure                                                0.0517∗∗∗
  S.e.                                                                               (0.00500)
Observations                                  212          212          212          212
OLS R2                                        .613         .102         .628         .561

Notes. Fixed private nonresidential capital and investment series are from the BEA. Quarterly data, 1953:3 to 2007:2. Classic q is constructed from the flow of funds, as in Hall (2001). Bond q is constructed by applying the structural model to Corporate and Treasury yields, expected inflation, book leverage, and firm volatility measured with idiosyncratic stock returns. The alternate measure of Bond q uses idiosyncratic sales growth volatility as an input. Newey–West standard errors with autocorrelation up to four quarters are reported in parentheses. ∗, ∗∗, and ∗∗∗ denote statistical significance at the 10%, 5%, and 1% levels. Constant terms are omitted.

Figure VII shows the four-quarter difference in the investment rate, a measure used by Hassett and Hubbard (1997), among others, because of the high autocorrelation of the series in levels. The corresponding regressions are presented in the bottom panel of Table III. The fit of the equation in differences is even better than the fit in levels, with an R2 above 60%. In the third regression, the change in corporate cash flows over capital is added to the right-hand side of the equation, but it is insignificant and does not improve the fit of the equation.

The construction of qbond uses idiosyncratic stock returns to measure firm volatility. Note that using idiosyncratic return volatility is justified even when the aggregate stock market is
potentially mispriced. Mispricing across firms is limited by the possibility of arbitrage. In the aggregate, however, arbitrage is much more difficult. There is therefore no inconsistency in using the idiosyncratic component of stock returns to measure idiosyncratic risk, while acknowledging that the aggregate stock market can sometimes be overvalued.

FIGURE VII. Four-Quarter Changes in Investment Rate, Actual and Predicted. I/K is corporate fixed investment over the replacement cost of equipment and structures. Bond q is constructed from Moody’s yield on Baa bonds, using the structural model calibrated to the observed evolutions of book leverage and firm volatility, expected inflation from the Livingston survey, and the yield on 10-year Treasury bonds. Series plotted: change in I/K from t − 4 to t, and the change predicted with lagged bond q, 1955 to 2005.

Nonetheless, one might be concerned about the use of equity returns here, and I have repeated the calibration using the standard deviation of sales growth as a measure of volatility. The results, in the last column of Table III, are somewhat weaker than with the benchmark model. The reason is that sales volatility is a lagging indicator of idiosyncratic risk. Hilscher (2007) shows that the bond market is actually forward-looking for volatility. As a result, using a measure of volatility that lags the true information (as all accounting measures do) creates a specification error. This matters less for the equation in changes because of the smaller role of volatility in that equation.
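The return-based volatility measure described earlier in this section (a six-month moving average of the monthly cross-sectional standard deviation of stock returns, rescaled to a sample mean of 14%) can be sketched in a few lines of Python. The function names and the return data below are hypothetical, and the choices of a population standard deviation and a trailing window are assumptions, since the text does not specify them.

```python
import statistics


def cross_sectional_std(returns_by_month):
    """Standard deviation across firms, computed month by month."""
    return [statistics.pstdev(month) for month in returns_by_month]


def moving_average(series, window=6):
    """Trailing moving average; the first window - 1 observations are dropped."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def scale_to_mean(series, target_mean=0.14):
    """Rescale so that the sample mean equals target_mean (14% in the text)."""
    factor = target_mean / (sum(series) / len(series))
    return [factor * v for v in series]


# Made-up monthly returns for four hypothetical firms over eight months.
returns_by_month = [
    [0.02, -0.01, 0.04, 0.00],
    [0.01, 0.03, -0.02, 0.05],
    [-0.03, 0.02, 0.01, 0.00],
    [0.06, -0.04, 0.02, 0.01],
    [0.00, 0.01, -0.01, 0.03],
    [0.02, 0.02, -0.05, 0.04],
    [-0.01, 0.05, 0.03, -0.02],
    [0.04, -0.03, 0.00, 0.02],
]
dispersion = cross_sectional_std(returns_by_month)
sigma_hat = scale_to_mean(moving_average(dispersion, window=6))
```

The same three steps apply to the sales-based alternative by replacing monthly returns with firm-level sales growth rates, at whatever frequency the accounting data allow.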
The conclusions from this empirical section are the following:
• With aggregate U.S. data, qbond fits the investment equation well, both in levels and in differences.
• The estimated elasticity of investment to qbond is 18 times higher than the one estimated with qusual.
• Corporate cash flows do not have significant explanatory power once qbond is included in the regression.

V.C. Further Evidence

The evidence presented above is based on the construction of qbond in equation (19). In this section, I provide evidence on the explanatory power of the components separately, and on the role of nonlinearities in the model. I also test the predictive power of the model. The results are in Tables IV and V.

Explanatory Power of Individual Components. The econometric literature has studied the predictive power of default spreads for real economic activity.14 Table IV shows the explanatory power of the components of qbond, in levels and in four-quarter differences. Consider first the top part of Table IV, for the regressions in levels. Column (1) shows that the Baa spread, by itself, has no explanatory power for investment. The explanatory power appears only when idiosyncratic volatility and leverage are also included, in Column (3). These factors have not been used in the empirical literature. Their empirical importance provides support for the theory developed in this paper. In addition, notice that even when all the components are entered linearly, the explanatory power is only 45%. By contrast, qbond has an R2 of 57.4% with one degree of freedom instead of four. This shows that the nonlinearities are important in the level equation, as explained below.

14. Bernanke (1983) notes that the spread of Baa over Treasury went “from 2.5 percent during 1929–30 to nearly 8 percent in the mid-1932” and shows that the spread was a useful predictor of industrial production growth. Using monthly data from 1959 to 1988, Stock and Watson (1989) find that the spread between commercial paper and Treasury bills predicts output growth. Some of these relationships are unstable over time (see Stock and Watson [2003] for a survey).

For the equation in four-quarter differences, the spread by itself has significant explanatory power. This is what one would expect, because the low-frequency movements in leverage and volatility matter less in these regressions. Nonetheless, leverage and volatility are still highly significant. The unrestricted linear
model has an R2 of 61.6%, compared to 61.3% for the bond q model. This suggests that, also as expected, nonlinear effects are not crucial for the specification in changes.

TABLE IV
DECOMPOSING BOND q

Equation in levels: I/K(t)
                                          (1)         (2)         (3)         (4)         (5)
yBaa − r10 (t − 1)                        −0.166      −0.152      −1.051∗∗∗
  S.e.                                    (0.189)     (0.189)     (0.177)
Real risk-free rate (t − 1)                           −0.0700     −0.0781
  S.e.                                                (0.0796)    (0.0744)
Idiosyncratic volatility (t − 1)                                  0.278∗∗∗    0.424∗∗∗    0.407∗∗∗
  S.e.                                                            (0.0759)    (0.0672)    (0.0690)
Book leverage (t − 1)                                             0.0910∗∗∗   0.0633∗∗∗   0.0727∗∗∗
  S.e.                                                            (0.0214)    (0.0172)    (0.0159)
[0.1 + r10]/[0.1 + yBaa] (t − 1)                                              0.252∗∗∗    0.263∗∗∗
  S.e.                                                                        (0.0268)    (0.0291)
Real discount factor (t − 1)                                                  0.203∗∗∗    0.187∗∗∗
  S.e.                                                                        (0.0673)    (0.0649)
Quadratic term (t − 1)                                                                    1.069∗∗
  S.e.                                                                                    (0.522)
N                                         216         216         216         216         216
OLS R2                                    .013        .023        .451        .582        .604

Estimation in changes: I/K(t) − I/K(t − 4)
                                          (1)         (2)         (3)         (4)         (5)
[yBaa − r10] (t − 5, t − 1)               −0.942∗∗∗   −0.953∗∗∗   −0.997∗∗∗
  S.e.                                    (0.103)     (0.108)     (0.0910)
[real risk-free rate] (t − 5, t − 1)                  −0.0237     −0.00297
  S.e.                                                (0.0355)    (0.0338)
[idiosyncratic volatility] (t − 5, t − 1)                         0.265∗∗∗    0.283∗∗∗    0.270∗∗∗
  S.e.                                                            (0.0885)    (0.0896)    (0.0912)
[book leverage] (t − 5, t − 1)                                    0.172∗∗∗    0.142∗∗∗    0.133∗∗∗
  S.e.                                                            (0.0516)    (0.0515)    (0.0502)
[(0.1 + r10)/(0.1 + yBaa)] (t − 5, t − 1)                                     0.199∗∗∗    0.201∗∗∗
  S.e.                                                                        (0.0195)    (0.0195)
[real discount factor] (t − 5, t − 1)                                         0.0787∗     0.0765∗
  S.e.                                                                        (0.0419)    (0.0412)
[quadratic term] (t − 5, t − 1)                                                           0.398
  S.e.                                                                                    (0.293)
Observations                              212         212         212         212         212
OLS R2                                    .478        .479        .618        .618        .622

Notes. Fixed private nonresidential capital and investment series are from the BEA. Quarterly data, 1953:3 to 2007:2. The ex ante real rate is the nominal rate minus expected inflation from the Livingston survey. The real discount factor is (1 + E[inflation])/(1 + nominal rate). Firm volatility is measured with idiosyncratic stock returns. The nominal rates are r10 for 10-year Treasury bonds, and yBaa for Moody’s index of Baa bonds. Quadratic term is the square of the relative price of Baa bonds minus its mean: [(0.1 + r10)/(0.1 + yBaa) − 0.9]2. Newey–West standard errors with autocorrelation up to 4 quarters are reported in parentheses. ∗, ∗∗, and ∗∗∗ denote statistical significance at the 10%, 5%, and 1% levels. Constant terms are omitted.

Nonlinear Effects. There are several nonlinear effects in the model. Consider equation (4): Tobin’s q has two components, the real discount factor and the expected risk-neutral value of capital, Eπ[v(ω′)|ω]. This latter term is a function of the relative price of corporate bonds, as shown in Figures I and II. Thus, the model suggests the use of the relative price (φ + r_t^10)/(φ + y_t^Baa) instead of the spread y_t^Baa − r_t^10. When rates are stable, the difference between the spread and the relative price is negligible. In the data, however, the level of nominal rates changes a lot. A given change in the spread has a larger impact on the relative price when rates are low than when they are high. Column (4) provides strong support for this first nonlinearity. The relative price does much better than the spread in the level regression.15 The R2 increases from 45.1% to 58.2% because of the nonlinear correction. A second nonlinearity comes from the mapping of Figure I. Tobin’s q is a convex function of the relative bond price. Column (5) shows that this effect is significant, but it only increases the R2 by 2 percentage points. The last column of Table IV can also be compared to the first column of Table III. In levels, the structural model has a fit of 57.4%. The unrestricted nonlinear model has a fit of 60.4%. In a statistical sense, the difference is significant, but in an economic sense, it does not appear very important. In differences, the respective performances are 61.3% and 62.2%. These results support the restrictions imposed by the theory.

TABLE V
PREDICTIVE REGRESSIONS, ONE QUARTER AHEAD: MAXIMUM LIKELIHOOD ESTIMATION OF AUTOREGRESSIVE MODEL

Dependent variable: growth rate of private nonresidential fixed investment in columns (1) to (4), growth rate of consumption in column (5), growth rate of residential investment in column (6).

                              (1)         (2)         (3)         (4)         (5)          (6)
[bond q] (t − 1)              0.148∗∗∗                0.160∗∗∗    0.161∗∗∗    0.0119       0.0279
  S.e.                        (0.0231)                (0.0251)    (0.0247)    (0.00917)    (0.0417)
log[real GDP] (t − 1)                     0.710∗∗∗    0.760∗∗∗    0.797∗∗∗    0.169∗∗∗     −0.162
  S.e.                                    (0.182)     (0.199)     (0.200)     (0.0585)     (0.346)
[classic Q] (t − 1)                                               0.0130      0.00976∗∗∗   0.0401∗∗
  S.e.                                                            (0.00955)   (0.00279)    (0.0184)
AR(1)                         0.396∗∗∗    0.337∗∗∗    0.131       0.0996      0.0206       0.511∗∗∗
  S.e.                        (0.0778)    (0.0832)    (0.0935)    (0.0928)    (0.0832)     (0.0543)
Observations                  215         215         215         215         215          215
R2 of OLS                     .398        .341        .430        .443        .149         .335

Notes. Maximum likelihood estimation of coefficients and standard errors, assuming AR(1) model. The R2 is for the corresponding OLS regression with lagged dependent variable on the right-hand side. Fixed private nonresidential investment series is from the BEA. Quarterly data, 1953:3 to 2007:2. Bond q is constructed by applying the structural model to Corporate and Treasury yields, expected inflation, book leverage, and firm volatility measured with idiosyncratic stock returns. Constant terms are omitted.

Predictive Regressions. Table V reports the results from predictive regressions of the growth rate of three macroeconomic variables: real corporate investment, real consumption expenditures, and real residential investment.
In each case, I run two separate regressions. I estimate an AR(1) model by maximum likelihood to obtain the correct coefficients and standard errors. I also run an OLS regression with the lagged dependent variable on the RHS to get a sense of the R2 of the simple linear regression.

15. Note that, in theory, this could also apply to the real discount factor: (1 + E[inflation])/(1 + r$) is not the same as E[inflation] − r$ when nominal shocks are large. Empirically, this nonlinearity seems to matter much less, probably because the real rate is not as volatile as the Baa yield.
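The first nonlinearity discussed above, that a given change in the spread moves the relative price more when rates are low, can be checked with a few lines of arithmetic. The Python sketch below uses φ = 0.1 as in the text. The rate levels (3% and 10%) and the spread values are hypothetical inputs chosen for illustration.

```python
PHI = 0.1  # inverse of the 10-year average maturity used in the text


def relative_price(r: float, y: float, phi: float = PHI) -> float:
    """Price of corporate bonds relative to risk-free bonds, (phi + r)/(phi + y)."""
    return (phi + r) / (phi + y)


def price_response(r: float, spread: float, widening: float) -> float:
    """Change in the relative price when the spread widens, holding r fixed."""
    before = relative_price(r, r + spread)
    after = relative_price(r, r + spread + widening)
    return after - before


# Same one-point widening of a 2% spread at two hypothetical rate levels.
drop_low_rates = price_response(r=0.03, spread=0.02, widening=0.01)
drop_high_rates = price_response(r=0.10, spread=0.02, widening=0.01)

# Quadratic term as defined in the notes to Table IV.
quadratic_term = (relative_price(0.03, 0.05) - 0.9) ** 2

print(round(drop_low_rates, 4), round(drop_high_rates, 4))
```

With these inputs the relative price falls by about 0.054 when the risk-free rate is 3% and by about 0.040 when it is 10%, even though the spread widens by the same amount in both cases.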
The first column shows that qbond is a very significant predictor of corporate investment growth. It predicts better than the “accelerator” model based on lagged output growth (column (2)). While lagged output growth still has significant marginal forecasting power, it increases the R2 by only 3 percentage points (column (3)). In addition, the coefficient on qbond actually goes up. Column (4) shows that qusual has no predictive power for corporate investment.16

16. Fama (1981) shows that stock prices have little forecasting power for output. Cochrane (1996) finds a significant correlation between stock returns and the growth rate of the aggregate capital stock, but Hassett and Hubbard (1997) argue that it is driven by the correlation with residential investment, not corporate investment. In any case, I find that the bond market’s q outperforms the usual measure both in differences and in levels.

The last two columns focus on consumption and residential investment. Although qbond is the best predictor of corporate investment, it does not predict housing or consumption. qusual, on the other hand, does not predict corporate investment, but it does predict housing and (to some extent) consumption. These results are suggestive of wealth effects from the equity market. They are consistent with the results of Hassett and Hubbard (1997) but clearly inconsistent with the usual implementation of the q-theory.

The conclusions from this empirical section are the following:
• All the components identified by the theory (bond spreads, volatility, leverage, risk-free rate) are statistically and economically significant.
• The fit of the restricted structural model is almost as good as the fit of the unrestricted regressions.
• The nonlinearities of the model (relative price instead of spreads, convexity of mapping) are important for the level regressions.
• The bond market predicts future corporate investment well, whereas the equity market has no marginal predictive power.

VI. THEORETICAL EXPLANATIONS

The results so far show that it is possible to link corporate investment and asset prices, using the corporate bond market and modern asset pricing theory. They do not explain why the usual approach fails, however. This section sheds some light on this complex question.
It is important to recognize that a satisfactory explanation must address two related but distinct issues:
1. Why is qusual more volatile than qbond?
2. Why does qbond fit the investment equation better?
I consider two explanations.17 The first explanation is based on growth options and the distinction between average and marginal q. The second explanation is based on mispricing in the equity market. I chose these explanations because they provide useful benchmarks. They are not mutually exclusive, and they are not the only possible explanations.
VI.A. Growth Option Interpretation
Suppose that, in addition to the value process in equation (3), the firm also has a growth option of value $G_t$. Total firm value is then
\[ V_t = v_t k_{t-1} + G_t. \tag{20} \]
Consider for simplicity the example of Section III, with short-term debt and a constant risk-free rate. The value of short-term debt is
\[ B_t = \frac{1}{1+r}\, E^{\pi}_t\left[\min\left(\Psi_t;\; v_{t+1} k_t + G_{t+1}\right)\right]. \tag{21} \]
Let $G_t$ be a binary variable: $G_t = G^H$ with risk-neutral probability $\lambda_{t-1}$, and $G^L$ otherwise. The following proposition states that a growth option with enough skewness can explain why qbond fits better than qusual.
PROPOSITION 3. Consider the model of equations (20) and (21). By choosing $\lambda_t$ and $G^L$ small enough, and $G^H$ large enough, the fit of the investment equation can be arbitrarily good for qbond, and arbitrarily poor for qusual.
Proof. See the Appendix.
The intuition behind Proposition 3 is straightforward. A small probability of a large positive shock has a large impact on equity prices, and almost no impact on bond prices. Because growth options do not depend on the capital stock, news about the likelihood of these future shocks does not affect investment. In essence, growth options drive wedges between bond and equity prices, and between marginal and average q.
What are the possible interpretations of these shocks? The simplest one is that firms earn organizational rents. Think of a large industrial corporation with outstanding organizational capital. This firm will be able to seize new opportunities if and when they arrive. This might happen through mergers and acquisitions or through internal development of new lines of business. Investing more in the current business and current technology does not improve this option value.18
To summarize, the rational interpretation proposes the following answers to the two questions posed at the beginning of this section:
1. Why is qusual more volatile than qbond? Because growth options affect stocks much more than bonds.
2. Why does qbond fit the investment equation better? Because growth options are unrelated to current capital expenditures.
The example given is obviously extreme, but the lesson is a general one. It is not difficult to come up with a story where current capital expenditures are well explained by the bond market, whereas firm creation, IPOs, and perhaps R&D, are better explained by the equity market. A complete understanding of these joint dynamics is an important topic for future research.
VI.B. Mispricing Interpretation
Stein (1996) analyzes capital budgeting in the presence of systematic pricing errors by investors, assuming that managers have rational expectations. He emphasizes three crucial aspects of capital budgeting in such a world: (i) the true NPV of investment, (ii) the gains from trading mispriced securities, and (iii) the costs of deviating from an optimal capital structure in order to achieve (i) and (ii). For the purpose of my paper, the most important result is that when capital structure is not a constraint, and when managers have long horizons, real investment decisions are not influenced by mispricing (Stein 1996, Proposition 3).
17. For an investigation of whether the same pricing kernel can price bonds and stocks, see Chen, Collin-Dufresne, and Goldstein (forthcoming).
18. Some other expenditures could be complementary to the option value. These could include R&D and reorganizations. At the aggregate level, one might think that new options were realized by new firms. This would explain why IPOs are correlated with the equity market (Jovanovic and Rousseau 2001). For a model of growth options at the firm level, see Abel and Eberly (2005).
Gilchrist, Himmelberg, and Huberman (2005) consider a model where mispricing comes from heterogeneous beliefs and short sales constraints. They show that increases in dispersion of investor opinion cause stock prices to rise above their fundamental values. This leads to an increase in q, share issues, and real investment. The main difference from Stein (1996) is that they assume that investors do not overvalue cash held in the firm. This assumption rules out the separation of real and financial decisions: managers who seek to exploit mispricing must alter their investment decisions and Proposition 3 in Stein (1996) does not hold. However, Gilchrist, Himmelberg, and Huberman (2005) show that even large pricing errors need not have large effects on investment. Thus, it is possible to explain the fact that investment does not react much to equity mispricing, even when the strict dichotomy of Proposition 3 in Stein (1996) fails.
Neither Stein (1996) nor Gilchrist, Himmelberg, and Huberman (2005) considers the role of bonds and stocks separately, so it appears that the story is still incomplete. It turns out, however, that recent work in behavioral finance has shown that skewed assets are more likely to be mispriced (Barberis and Huang 2007; Brunnermeier, Gollier, and Parker 2007; Mitton and Vorkink 2007). A direct implication is that mispricing is more likely to appear in the equity market than in the bond market. Of course, mispricing can also happen in the bond market. Piazzesi and Schneider (2006), for instance, analyze the consequences for asset prices of disagreement about inflation expectations.
To summarize, the behavioral interpretation proposes the following answers to the two questions posed at the beginning of this section:
1. Why is qusual more volatile than qbond? Because mispricing is more likely in the equity market than in the bond market.
2. Why does qbond fit the investment equation better? Because managers do not react (much) to mispricing.
The growth option and mispricing interpretations are not mutually exclusive. In fact, the term Gt in equation (20) is the most likely to be mispriced. The rational and behavioral explanations simply rely on different critical assumptions. In the rational case, Gt must not depend on k, otherwise investment would respond. In the behavioral story, it is important that managers have long horizons.
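The growth option intuition can be illustrated with a minimal one-period sketch in the spirit of equations (20) and (21). Everything numerical here is an assumption made for illustration: the four equally likely values of v·k, the face value of debt, the size of the growth option, and the two probabilities are hypothetical and are not taken from the paper. The function name is also hypothetical. The sketch compares how a piece of news that doubles the small probability of the high growth option moves the bond value and the equity value.

```python
def bond_and_equity_value(prob_high, g_high, g_low, face_value, asset_payoffs, r):
    """One-period values of debt and equity with a binary growth option.

    Debt pays min(face_value, v*k + G) and equity pays the residual, both
    discounted at the risk-free rate r under risk-neutral probabilities.
    """
    bond, equity = 0.0, 0.0
    for prob_g, g in ((prob_high, g_high), (1.0 - prob_high, g_low)):
        for payoff in asset_payoffs:              # equally likely values of v*k
            total = payoff + g
            weight = prob_g / len(asset_payoffs)
            bond += weight * min(face_value, total)
            equity += weight * max(total - face_value, 0.0)
    return bond / (1.0 + r), equity / (1.0 + r)


# Hypothetical inputs: small probability, large G^H, G^L equal to zero.
payoffs = [0.3, 0.8, 1.2, 1.7]
before = bond_and_equity_value(0.01, 50.0, 0.0, 0.5, payoffs, 0.03)
after = bond_and_equity_value(0.02, 50.0, 0.0, 0.5, payoffs, 0.03)

bond_change = after[0] / before[0] - 1.0
equity_change = after[1] / before[1] - 1.0
print(f"relative change in bond value:   {bond_change:.4f}")
print(f"relative change in equity value: {equity_change:.4f}")
```

Under these assumed numbers the news barely moves the bond value while the equity value moves substantially, which is the wedge between bond and equity prices that the growth option interpretation relies on.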
VII. THEORETICAL ROBUSTNESS
The beauty of the standard q theory is its parsimony. Beyond the assumptions of constant returns and convex costs, it is extremely versatile. In equation (4), the sources of variations in q(ω) include changes in the term structure of risk-free rates, cash flow news that has aggregate, industry, and firm components, and changes in risk premia that separate the market value Eπ [v ] from the objective expectation E[v ]. These multidimensional shocks can be combined in arbitrary ways, and yet their joint impact on investment can be summarized by one real number.
Unfortunately, the standard approach fails. The previous section has presented two explanations for this failure, as well as for the (relative) success of the new approach. The new approach, however, is not as model-free as the standard approach. The mappings of Figures I and II are constructed under specific assumptions regarding firm and aggregate dynamics. The goal of this section is to study the theoretical robustness of the new approach. To do so, I focus on three issues:
• Is there an exact mapping at the firm level, similar to the one in Figure I, for aggregate q? The answer turns out to be no, but qbond is still a useful measure.
• Suppose that aggregate dynamics do not follow a simple autoregressive process under the risk-neutral measure. Would the misspecified mapping of Figure I still deliver a good fit? Yes.
• What happens when the Modigliani–Miller assumptions do not hold? If anything, the model seems to work better in this case.
VII.A. Firm-Level Mappings
I first study the extent to which the aggregate mapping of Figure II applies at the firm level. Figure VIII and the left part of Table VI report the results based on a simulated panel of fifty years and 100 firms, using the benchmark model with the parameters in Table II. To get an exact mapping, there must be a monotonic relationship between asset value and bond prices.
This is typically the case when there is only one dimension of heterogeneity. In the top left panel of Table VI, the R2 for the aggregate regression is 1 and the estimated elasticity is exactly equal to 1/γ2 (0.1, because γ2 is calibrated to 10 years). At the firm level, there are two sources of
FIGURE VIII
Simulations of a Panel of Firms
5,000 firm–year observations of relative bond price and Tobin’s q. Simulation with constant real rate of 3%, constant idiosyncratic volatility, and constant book leverage. (Vertical axis: Tobin’s q; horizontal axis: relative price of corporate bonds.)
variation, aggregate and idiosyncratic. Conditional on one shock, there is an exact mapping,19 but there is no guarantee that the ranking would be preserved across several types of shocks. In fact, Figure VIII shows that they are not.20 The bottom left panel of Table VI shows that R2 for firm-level regressions is less than one. In the univariate regression, this does not bias the point estimate. In the multivariate regression, firm-level cash flows are significant, R2 increases, but the point estimate of qbond becomes unreliable. The conclusion is that, at the firm level, the bond q should be significant but cash flows are likely to remain significant as well. These predictions are consistent with the results obtained 19. For instance, fix the aggregate state, and look at the cross section. Then firms with good earnings shocks have high value and high bond prices. Or fix the firm-level shock, and then states with high values have high bond prices. 20. The intuition is the following. Suppose a firm–year observation has a true q of 1.1 based on a good firm shock in a bad aggregate state. Suppose another firm– year observation has a true q of 1.1 based on a medium firm shock in a medium aggregate state. There is no reason to expect them to have the same bond price (for instance, because persistence and volatility are not the same for idiosyncratic and aggregate shocks). As a result, the same value of q for one firm–year observation is associated with several relative prices of bonds. This is what Figure VIII shows.
TABLE VI
INVESTMENT REGRESSIONS IN SIMULATED DATA

Aggregate regressions: Dependent variable is aggregate I/K
                Benchmark model                              Model B: binary aggregate cash flows
Bond q          0.100***                      0.1000***      0.0799***                     0.0796***
S.e.            (0.00000111)                  (0.000148)     (0.000623)                    (0.000648)
Profit rate                    0.210***       −0.000318                     0.0394*        0.00176
S.e.                           (0.000159)     (0.000312)                    (0.0222)       (0.00158)
Observations    100            100            100            100            100            100
R2              1.000          .98            1.000          .994           .031           .995

Firm-level regressions: Dependent variable is firm I/K
                Benchmark model                              Model B: binary aggregate cash flows
Bond q          0.0993***                     0.0482***      0.0791***                     0.0703***
S.e.            (0.000619)                    (0.000649)     (0.000408)                    (0.000276)
Profit rate                    0.100***       0.0610***                     0.0898***      0.0434***
S.e.                           (0.000525)     (0.000638)                    (0.00173)      (0.000498)
Observations    5,000          5,000          5,000          5,000          5,000          5,000
R2              .838           .879           .943           .883           .349           .953

Notes. Simulated annual data. The benchmark model is the one used in the main part of the paper, calibrated in Table II. In model B, the aggregate component of the profit rate is either high or low, and the probability of the high state follows a Markov chain under the risk-neutral measure. In both cases, the risk-free rate is constant at 3%, book leverage is constant at 0.45, and idiosyncratic shocks follow the benchmark model. Bond q is constructed using the benchmark mapping of Figure I (it is thus deliberately misspecified for model B). The simulation is for fifty firms and 100 years. OLS standard errors are in parentheses.
by Gilchrist and Zakrajsek (2007) with a large panel data set of firm-level bond prices. They regress the investment rate on a firm-specific measure of the cost of capital, based on firm-level bond yields and industry-specific prices for capital. They find a strong negative relationship between the investment rate and the corporate yields, and they also find that qusual and cash flows remain significant. There are other explanations for the discrepancy between micro and macro results. Returns to scale might be decreasing at the level of an individual firm, even though they are constant for the economy as a whole. This could explain why cash flows are significant in the micro data but not in the macro data. Finally, to the extent that mispricing explains some of the discrepancy between qusual and qbond , the results are consistent with the argument in Lamont and Stein (2006) that there is more mispricing at the aggregate level than at the firm level. VII.B. Robustness to Model Misspecifications I now turn to the issue of the specification of aggregate dynamics. The mapping in Figure I assumes that aggregate dynamics follow an AR(1) process. This is a restrictive assumption, especially under the risk-neutral measure.21 A second model is therefore used to check the robustness of the results. Model B (described in the Appendix) is meant to be the polar opposite to the benchmark model as far as aggregate dynamics are concerned (idiosyncratic shocks are unchanged). This model captures two important ideas. First, cash flows might have a short- and a long-run component, the long-run one being more relevant for valuation and investment. Second, holding constant the objective distribution of cash flows, changes in the market price of risk (due to changes in risk aversion or conditional volatilities) affect the risk-neutral likelihood of good and bad states. In both cases, aggregate cash flows would not summarize the aggregate state. 
This model is empirically relevant because Vuolteenaho (2002) shows that much of the volatility at the firm level reflects cash-flow news, whereas discount rate shocks are much more important in the aggregate. I simulate a panel similar to the one discussed above (fifty years, 100 firms). I then use the mapping of Figure I to construct 21. Moreover, this assumption implies that the current aggregate profit rate is a sufficient statistic for the current aggregate state, which is clearly unrealistic. This can be seen in the simulated aggregate regressions where aggregate cash flows have an R2 of .98 (Table VI, column (2)).
q. The model is therefore misspecified because the benchmark mapping is used to construct qbond in a world where aggregate dynamics are substantially different from the benchmark. The first result in Table VI is that the model retains most of its explanatory power. qbond is a reliable predictor even when the model is misspecified. The R2 is still close to one, at 99.4%. The only issue is the bias in the estimated coefficient, which overestimates adjustment costs by about 20%. The second important message of Table VI is that cash flows are not reliable in the aggregate regression. In model B, cash flows have no explanatory power for aggregate investment. At the firm level, cash flows remain significant, as expected, because firm-level dynamics is the same as in the benchmark model. VII.C. Bankruptcy Costs and Leverage The benchmark model is built under the assumptions of no taxes and no bankruptcy costs (Modigliani and Miller 1958).22 I now study how qbond performs if there are taxes and bankruptcy costs. To focus on the crucial issues and to avoid heavy notations, I consider here a simple one-period example. Investment takes place at the beginning of the period, and returns are realized at the end. The risk-free rate is normalized to zero. Profits are taxed at a flat rate, payments to bondholders are deductible, and there are bankruptcy costs. The details of the model are described in the Appendix. Figure IX shows the mappings for different values of distress costs (in the range of values consistent with empirical estimates). Distress costs do not appear to invalidate the approach taken in this paper. The shape of the mapping is similar across the various models.23 If anything, higher bankruptcy costs make the mapping from relative bond prices to q more linear, and thus easier to estimate empirically. VIII. CONCLUSION This paper has shown that it is possible to construct Tobin’s q using bond prices, by bringing the insights of Black and Scholes 22. 
In such a world, capital structure is irrelevant, and arbitrary changes in leverage are possible without affecting investment. This issue was discussed at the end of Section IV. 23. The fact that one mapping is higher than another on average is irrelevant because it relates only to the average value of q.
FIGURE IX
Mappings with Taxes and Bankruptcy Costs
Computations for the simple one-period model described in the Appendix, assuming that asset values are lognormally distributed. Moderate distress costs are consistent with the estimates of Andrade and Kaplan (1998). (Vertical axis: Tobin’s q; horizontal axis: relative price of corporate bonds; series: no distress costs, moderate distress costs, large distress costs.)
(1973) and Merton (1974) to the investment models of Abel (1979) and Hayashi (1982). The bond market’s q performs much better than the usual measure of q when used to fit the investment equation using postwar U.S. data. The explanatory power is good (both in level and in differences), cash flows are no longer significant, and the inferred adjustment costs are almost twenty times smaller. Two interpretations of these results are possible. The first is that the equity market is subject to severe mispricing, whereas the bond market is not, or at least not as much. This interpretation is consistent with the arguments in Shiller (2000) and the work of Stein (1996), Gilchrist, Himmelberg, and Huberman (2005), Barberis and Huang (2007), and Brunnermeier, Gollier, and Parker (2007). Another interpretation is that the stock market is mostly right, but that it measures something other than the value of the existing stock of physical capital. This is the view pushed by Hall (2001) and McGrattan and Prescott (2007). According to this
view, firms accumulate and decumulate large stocks of intangible capital. If the payoffs from intangible capital were highly skewed, then they could affect equity prices more than bond prices, and this could explain the results presented in this paper. The difficulty of this theory, of course, is that it rests on a stock of intangible capital that we cannot readily measure (see Atkeson and Kehoe [2005] for a plant-level analysis). Looking back at Figure IV, it is difficult to imagine a satisfactory answer that does not mix the two theories. Moreover, these theories are not as contradictory as they appear, because the fact that intangible capital is hard to measure increases the scope for disagreement and mispricing. One can hope that future research will be able to reconcile the two explanations.
APPENDIX
A. Proof of Proposition 1
Let $\theta_\tau$ be the marginal default rate during period $\tau$. Let $\Theta_{t,\tau}$ be the cumulative default rate in periods $t+1$ up to $\tau-1$. In other words, if a bond has not defaulted at time $t$, the probability that it enters time $\tau > t$ is $1-\Theta_{t,\tau}$. Thus, by definition, $\Theta_{t,t+1} = 0$ and the default rates satisfy the recursive structure: $1-\Theta_{t,\tau} = (1-\theta_{t+1})(1-\Theta_{t+1,\tau})$. The value at the end of period $t$ of one unit of outstanding principal is
\[ b^1_t = E^{\pi}_t\left[\sum_{\tau=t+1}^{\infty} (1-\Theta_{t,\tau})\, \frac{(1-\varphi)^{\tau-t-1}}{(1+r_{t,\tau})^{\tau-t}} \left((1-\theta_\tau)(c+\varphi) + \theta_\tau V_\tau/\Psi_{\tau-1}\right)\right]. \tag{22} \]
Similarly, and just to be clear, the price of one unit of principal at the end of $t+1$ is
\[ b^1_{t+1} = E^{\pi}_{t+1}\left[\sum_{\tau=t+2}^{\infty} (1-\Theta_{t+1,\tau})\, \frac{(1-\varphi)^{\tau-t-2}}{(1+r_{t+1,\tau})^{\tau-t-1}} \left((1-\theta_\tau)(c+\varphi) + \theta_\tau V_\tau/\Psi_{\tau-1}\right)\right]. \tag{23} \]
Using the recursive structure of $\Theta$ and the law of iterated expectations, we can substitute (23) into (22) and obtain
\[ b^1_t = \frac{1}{1+r_t}\, E^{\pi}_t\left[(1-\theta_{t+1})(c+\varphi) + \theta_{t+1} V_{t+1}/\Psi_t\right] + \frac{1-\varphi}{1+r_t}\, E^{\pi}_t\left[(1-\theta_{t+1})\, b^1_{t+1}\right]. \tag{24} \]
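As a sanity check on the step from the infinite sum in equation (22) to the recursive form in equation (24), the two can be compared numerically in a deliberately simplified setting. The sketch below assumes constant parameters (a constant default rate, a flat risk-free rate, and a constant recovery value standing in for the ratio of firm value to face value), so that expectations drop out. These simplifications, the parameter values, and the function names are assumptions made for illustration and are not part of the proof.

```python
def bond_price_by_sum(theta, phi, c, r, recovery, n_terms=2000):
    """Truncated version of the infinite sum in equation (22), constant parameters."""
    payoff = (1.0 - theta) * (c + phi) + theta * recovery
    price = 0.0
    for k in range(n_terms):                      # k stands for tau - t - 1
        survival = (1.0 - theta) ** k             # 1 - Theta_{t,tau} with constant theta
        amortization = (1.0 - phi) ** k           # principal still outstanding
        discount = (1.0 + r) ** (k + 1)
        price += survival * amortization / discount * payoff
    return price


def bond_price_by_recursion(theta, phi, c, r, recovery, tol=1e-12):
    """Iterate the recursion in equation (24) to its fixed point, constant parameters."""
    payoff = (1.0 - theta) * (c + phi) + theta * recovery
    price = 0.0
    while True:
        updated = (payoff + (1.0 - phi) * (1.0 - theta) * price) / (1.0 + r)
        if abs(updated - price) < tol:
            return updated
        price = updated


# Hypothetical parameter values chosen only for the check.
params = dict(theta=0.02, phi=0.10, c=0.06, r=0.03, recovery=0.40)
price_sum = bond_price_by_sum(**params)
price_rec = bond_price_by_recursion(**params)
print(f"sum: {price_sum:.6f}  recursion: {price_rec:.6f}")
```

With constant parameters both expressions reduce to the same geometric series, so the two numbers should agree up to the truncation and tolerance used.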
Default happens when equity value reaches zero, that is, when
\[ V_t < \Psi_{t-1}\left(\varphi + c + (1-\varphi)\, b^1_t\right). \]
Therefore, the pricing function satisfies
\[ b^1_t = \frac{1}{1+r_t}\, E^{\pi}_t\left[\min\left(\varphi + c + (1-\varphi)\, b^1_{t+1};\; V_{t+1}/\Psi_t\right)\right]. \tag{25} \]
Now recall that $b^1_t$ is the price of one unit of outstanding capital. Let us define $b_t$ as the value of bonds outstanding at the end of time $t$, scaled by end-of-period physical assets,
\[ b_t \equiv \psi\, b^1_t, \tag{26} \]
where book leverage was defined in the main text as $\psi \equiv \Psi_t/k_t$, and assumed to be constant. Multiplying both sides of (25) by $\psi$, we obtain
\[ b_t = \frac{1}{1+r_t}\, E^{\pi}_t\left[\min\left\{(\varphi + c)\psi + (1-\varphi)\, b_{t+1};\; v_{t+1}\right\}\right]. \]
In recursive form, and with constant book leverage, this leads to equation (7). Note that if book leverage were state-contingent, the first term in the min function would simply be $(\varphi + c)\psi_t + (1-\varphi)\, b_{t+1}\, \psi_t/\psi_{t+1}$.
B. Proof of Proposition 3
Assume that $G^H > \Psi_t$. We can then write the debt pricing formula (21) as
\[ B_t = \frac{1}{1+r}\left\{(1-\lambda_t)\, E^{\pi}_t\left[\min\left(\Psi_t;\; v_{t+1} k_t + G^L\right)\right] + \lambda_t \Psi_t\right\}. \tag{27} \]
Taking the limit in equation (27) as $\lambda_t \to 0$ and $G^L \to 0$, it is clear that
\[ B_t = \frac{1}{1+r}\, E^{\pi}_t\left[\min\left(\Psi_t;\; v_{t+1} k_t\right)\right]. \]
This is the same pricing formula we used earlier, and we have already seen that one can construct a sufficient statistic for investment in this case. On the other hand, the market value of equity moves when it is revealed that $G_t = G^H$. It is always possible to increase the variance of the shocks by increasing $G^H$. Because these shocks are uncorrelated with investment, the explanatory power of the traditional measure can become arbitrarily small. Note that in a growing economy, it would make sense to index the growth option on aggregate TFP to obtain a model with a balanced growth path.
C. Model B
In this model, the conditional distribution of cash flows follows a Markov process. Aggregate cash flows can be either high, $a^H$, or low, $a^L$, and the risk-neutral probability of observing a high cash flow is state-dependent: $\Pr(a = a^H) = f(s)$. State $s$ follows a four-state Markov process under the risk-neutral measure. The complete aggregate state is $(s, a)$. There are therefore eight possible aggregate states: four states for $s$ and two for $a$. The persistence in the aggregate time series comes from the persistence in $s$. Conditional on $s$, aggregate cash flows are i.i.d. The transition matrix of $s$ is chosen to match the empirical moments in Table II. Firm-level dynamics $\eta$ are given by equation (17) as in the benchmark model.
D. Distress Costs
This is a one-period model. Without taxes or bankruptcy costs, the program of the firm is
\[ \max_{k}\; E^{\pi}[vk] - k - \gamma k^2/2, \tag{28} \]
where $k$ is investment and $v$ is a random variable. Optimal investment is
\[ k = (q-1)/\gamma, \tag{29} \]
where $q \equiv E^{\pi}[v]$. Now assume that profits are taxed at rate $\tau$, that payments to bondholders are deductible, and that there are proportional bankruptcy costs $\phi$. In case of default, a value $\phi v k$ is lost. The firm is financed with debt and equity, and let $\psi$ be book leverage. It is then straightforward to see that (29) still holds, but the definition of $q$ must be adjusted to
\[ q \equiv (1-\tau)\, E^{\pi}[v] + \tau\, E^{\pi}[\min(\psi, v)] - \phi\, E^{\pi}[v\, 1_{v<\psi}]. \tag{30} \]
The first term is the unlevered q. The second term captures the tax benefits of debt. The last term captures bankruptcy costs. The value of debt (relative to book assets) is
\[ b = E^{\pi}[\min(\psi, v)] - \phi\, E^{\pi}[v\, 1_{v<\psi}]. \tag{31} \]
Equity is $e = E^{\pi}[(1-\tau)(v-\psi)\, 1_{v>\psi}]$. Finally, optimal leverage solves the program
\[ \max_{\psi}\; \tau \int_0^{\infty} \min(\psi, v)\, dH(v) - \phi \int_0^{\psi} v\, dH(v), \tag{32} \]
where $H(.)$ is the cumulative distribution of $v$, and $h(.)$ the associated density. The first term measures the tax benefits of debt, whereas the second term measures the deadweight losses from financial distress. The first-order condition for optimal leverage is
\[ \tau \int_{\psi}^{\infty} dH(v) = \phi\, \psi\, h(\psi). \tag{33} \]
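Equations (30), (31), and (33) have closed forms when v is lognormal, so they can be evaluated with a short script. The sketch below is an illustration under stated assumptions, not the author's code: it takes the benchmark values reported in the text (volatility 0.75, lognormal mean −0.2, distress cost 0.2, book leverage 0.5), solves equation (33) for the tax rate, and then traces q against the bond value for a few lognormal means. Treating b/ψ as the relative price of bonds (the risk-free rate being normalized to zero), holding the tax rate fixed across distress-cost values, and all function names are assumptions introduced here.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def implied_tax_rate(psi, phi, mu, sigma):
    """Solve equation (33), tau * (1 - H(psi)) = phi * psi * h(psi), for tau."""
    z = (math.log(psi) - mu) / sigma
    survival = 1.0 - norm_cdf(z)                 # integral of dH(v) from psi to infinity
    density = norm_pdf(z) / (psi * sigma)        # lognormal density h(psi)
    return phi * psi * density / survival


def q_and_relative_bond_price(psi, phi, tau, mu, sigma):
    """Evaluate equations (30) and (31) in closed form for lognormal v."""
    mean_v = math.exp(mu + 0.5 * sigma ** 2)                     # E[v]
    z = (math.log(psi) - mu) / sigma
    loss_region = mean_v * norm_cdf(z - sigma)                   # E[v 1{v < psi}]
    e_min = loss_region + psi * (1.0 - norm_cdf(z))              # E[min(psi, v)]
    q = (1.0 - tau) * mean_v + tau * e_min - phi * loss_region   # equation (30)
    b = e_min - phi * loss_region                                # equation (31)
    return q, b / psi                                            # b / psi: assumed relative price


SIGMA, PSI = 0.75, 0.5
tau = implied_tax_rate(PSI, phi=0.2, mu=-0.2, sigma=SIGMA)
print(f"implied tax rate: {tau:.4f}")

mapping = {}
for phi in (0.0, 0.2, 0.4):
    mapping[phi] = [q_and_relative_bond_price(PSI, phi, tau, mu, SIGMA)
                    for mu in (-0.7, -0.2, 0.3)]
    for q, rel in mapping[phi]:
        print(f"phi={phi:.1f}  relative bond price={rel:.3f}  q={q:.3f}")
```

The implied tax rate comes out close to the 11.5% reported in the text, and for each distress-cost value both q and the relative bond price rise with the lognormal mean, so the mapping from bond prices to q is increasing in this sketch.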
I assume that v is lognormally distributed with volatility 0.75. In the benchmark case, I use ϕ = 0.2 and a lognormal mean of −0.2. These parameters yield values consistent with the evidence in Andrade and Kaplan (1998) and Almeida and Philippon (2007). Andrade and Kaplan (1998) estimate losses of around 10%–15% of firm value one year before bankruptcy. The parameter ϕ applies to ex post losses, and these happen when v turns out to be low. A value of 20% implies that the deadweight losses relative to initial firm value are around 10%. To be consistent with a book leverage of 0.5, equation (33) implies that τ = 11.5%. This is consistent with Graham (2000). The benchmark case is therefore chosen so that leverage is optimal (on average, not state by state) and distress costs are consistent with empirical estimates. To create Figure IX, I simulate the model with different values of the mean of the lognormal
distribution, from −0.7 to +0.3. Finally, I repeat the exercise for each value of the distress cost parameter: ϕ ∈ {0, 0.2, 0.4}. STERN SCHOOL OF BUSINESS, NEW YORK UNIVERSITY, NBER, AND CEPR
REFERENCES Abel, A. B., Investment and the Value of Capital (New York: Garland Publishing, 1979). Abel, A. B., and O. J. Blanchard, “The Present Value of Profits and Cyclical Movements in Investment,” Econometrica, 54 (1986), 249–273. Abel, A. B., and J. C. Eberly, “Investment, Valuation, and Growth Options,” Working Paper, Wharton, 2005. Almeida, H., and T. Philippon, “The Risk-Adjusted Cost of Financial Distress,” Journal of Finance, 62 (2007), 2557–2586. Andrade, G., and S. Kaplan, “How Costly Is Financial (Not Economic) Distress? Evidence from Highly Leveraged Transactions That Became Distressed,” Journal of Finance, 53 (1998), 1443–1493. Atkeson, A., and P. J. Kehoe, “Modeling and Measuring Organization Capital,” Journal of Political Economy, 113 (2005), 1026–1053. Bachmann, R., R. J. Caballero, and E. M. Engel, “Lumpy Investment in Dynamic General Equilibrium,” Cowles Foundation Discussion Paper 1566, 2006. Barberis, N., and M. Huang, “Stocks as Lotteries: The Implications of Probability Weighting for Security Prices,” Working Paper, Yale University, 2007. Bernanke, B., and M. Gertler, “Agency Costs, Net Worth and Business Fluctuations,” American Economic Review, 79 (1989), 14–31. Bernanke, B. S., “Nonmonetary Effects of the Financial Crisis in the Propagation of the Great Depression,” American Economic Review, 73 (1983), 257–276. Berndt, A., R. Douglas, D. Duffie, M. Ferguson, and D. Schranz, “Measuring Default-Risk Premia from Default Swap Rates and EDFs,” Working Paper, Stanford GSB, 2005. Black, F., and M. Scholes, “The Pricing of Options and Corporate Liabilities,” Journal of Political Economy, 81 (1973), 637–654. Brunnermeier, M. K., C. Gollier, and J. A. Parker, “Optimal Beliefs, Asset Prices, and the Preference for Skewed Returns,” American Economic Review Papers and Proceedings, 97 (2007), 159–165. Caballero, R., “Aggregate Investment,” in Handbook of Macroeconomics, J. B. Taylor and M. Woodford, eds., vol. 
1B (Amsterdam: Elsevier Science, North Holland, 1999). Caballero, R., and E. Engle, “Explaining Investment Dynamics in U.S. Manufacturing: A Generalized (S, s) Approach,” Econometrica, 67 (1999), 783– 826. Campbell, J. Y., M. Lettau, B. Malkiel, and Y. Xu, “Have Individual Stocks Become More Volatile? An Empirical Exploration of Idiosyncratic Risk,” Journal of Finance, 56 (2001), 1–43. Campbell, J. Y., and G. B. Taksler, “Equity Volatility and Corporate Bond Yields,” Journal of Finance, 58 (2003), 2321–2349. Chen, L., P. Collin-Dufresne, and R. S. Goldstein, “On the Relation between the Credit Spread Puzzle and the Equity Premium Puzzle,” Review of Financial Studies (forthcoming). Cochrane, J. H., “A Cross-Sectional Test of an Investment-Based Asset Pricing Model,” Journal of Political Economy, 104 (1996), 572–621. Comin, D., and T. Philippon, “The Rise in Firm-Level Volatility: Causes and Consequences,” in Macroeconomics Annual, M. Gertler and K. Rogoff, eds. (Cambridge, MA: NBER, 2005). Cumins, J. G., K. A. Hasset, and S. D. Oliner, “Investment Behavior, Observable Expectations, and Internal Funds,” American Economic Review, 96 (2006), 796– 810.
Davis, S., J. Haltiwanger, R. Jarmin, and J. Miranda, “Volatility and Dispersion in Business Growth Rates: Publicly Traded versus Privately Held Firms,” in Macroeconomics Annual, D. Acemoglu, K. Rogoff, and M. Woodford, eds. (Cambridge, MA: NBER, 2006). Dixit, A. K., and R. S. Pindyck, Investment under Uncertainty (Princeton, NJ: Princeton University Press, 1994). Duffie, D., and K. J. Singleton, Credit Risk (Princeton, NJ: Princeton University Press, 2003). Erickson, T., and T. M. Whited, “Measurement Error and the Relationship between Investment and Q,” Journal of Political Economy, 108 (2000), 1027–1057. ——, “On the Accuracy of Different Measures of Q,” Financial Management, 35 (2006), 5–33. Fama, E. F., “Stock Returns, Real Activity, Inflation, and Money,” American Economic Review, 71 (1981), 545–565. Fazzari, S. M., R. G. Hubbard, and B. C. Petersen, “Financing Constraints and Corporate Investment,” Brookings Papers on Economic Activity, 1 (1988), 141– 195. Gilchrist, S., and C. P. Himmelberg, “Evidence on the Role of Cash Flow for Investment,” Journal of Monetary Economics, 36 (1995), 541–572. Gilchrist, S., C. P. Himmelberg, and G. Huberman, “Do Stock Price Bubbles Influence Corporate Investment?” Journal of Monetary Economics, 52 (2005), 805– 827. Gilchrist, S., and E. Zakrajsek, “Investment and the Cost of Capital: New Evidence from the Corporate Bond Market,” NBER Working Paper No. 13174, 2007. Gomes, J. F., “Financing Investment,” American Economic Review, 91 (2001), 1263– 1285. Graham, J. R., “How Big Are the Tax Benefits of Debt?” Journal of Finance, 55 (2000), 1901–1941. Hackbarth, D., J. Miao, and E. Morellec, “Capital Structure, Credit Risk, and Macroeconomic Conditions,” Journal of Financial Economics, 82 (2006), 519– 550. Hall, R. E., “The Stock Market and Capital Accumulation,” American Economic Review, 91 (2001), 1185–1202. ——, “Corporate Earnings Track the Competitive Benchmark,” NBER Working Paper 10150, 2003. 
——, “Measuring Factor Adjustment Costs,” Quarterly Journal of Economics, 119 (2004), 899–927. Hassett, K. A., and R. G. Hubbard, “Tax Policy and Investment,” in Fiscal Policy: Lessons from the Literature, A. Auerbach, ed. (Cambridge, MA: MIT Press, 1997). Hayashi, F., “Tobin’s Marginal q and Average q: A Neoclassical Interpretation,” Econometrica, 50 (1982), 213–224. Hennessy, C. A., A. Levy, and T. M. Whited, “Testing Q Theory with Financing Frictions,” Journal of Financial Economics, 83 (2007), 691–717. Hilscher, J., “Is the Corporate Bond Market Foward Looking?” Working Paper, Brandeis University, 2007. Jorgenson, D. W., “Capital Theory and Investment Behavior,” American Economic Review, 53 (1963), 247–259. Jovanovic, B., and P. L. Rousseau, “Why Wait? A Century of Life before IPO,” American Economic Review Papers and Proceedings, 91 (2001), 336–341. Lamont, O. A., and J. C. Stein, “Investor Sentiment and Corporate Finance: Micro and Macro,” American Economic Review Papers and Proceedings, 96 (2006), 147–151. Lando, D., Credit Risk Modeling (Princeton, NJ: Princeton University Press, 2004). Leland, H. E., “Bond Prices, Yield Spreads, and Optimal Capital Structure with Default Risk,” Working Paper 240, IBER, University of California, Berkeley, 1994. ——, “Agency Costs, Risk Management, and Capital Structure,” Journal of Finance, 53 (1998), 1213–1243. ——, “Predictions of Default Probabilities in Structural Models of Debt,” Journal of Investment Management, 2 (2004), 5–20.
1056
QUARTERLY JOURNAL OF ECONOMICS
Leland, H. E., and K. B. Toft, “Optimal Capital Structure, Endogenous Bankruptcy, and the Term Structure of Credit Spreads,” Journal of Finance, 51 (1996), 987– 1019. Lettau, M., and S. Ludvigson, “Time-Varying Risk Premia and the Cost of Capital: An Alternative Implication of the Q Theory of Investment,” Journal of Monetary Economics, 49 (2002), 31–66. Lucas, R. E., and E. C. Prescott, “Investment under Uncertainty,” Econometrica, 39 (1971), 659–682. McGrattan, E. R., and E. C. Prescott, “Unmeasured Investment and the Puzzling U.S. Boom in the 1990s,” Federal Reserve Bank of Minneapolis Staff Report 369, 2007. Merton, R. C., “On the Pricing of Corporate Debt: The Risk Structure of Interest Rates,” Journal of Finance, 29 (1974), 449–470. Mitton, T., and K. Vorkink, “Equilibrium Underdiversification and the Preference for Skewness,” Review of Financial Studies, 20 (2007), 1255–1288. Modigliani, F., and M. H. Miller, “The Cost of Capital, Corporation Finance and the Theory of Investment,” American Economic Review, 48 (1958), 261–297. Myers, S. C., “The Capital Structure Puzzle,” Journal of Finance, 39 (1984), 575– 592. Piazzesi, M., and M. Schneider, “Inflation and the Price of Real Assets,” Working Paper, Chicago GSB, 2006. Shapiro, M. D., “The Dynamic Demand for Capital and Labor,” Quarterly Journal of Economics, 101 (1986), 513–542. Shiller, R. J., Irrational Exuberance (Princeton, NJ: Princeton University Press, 2000). Stein, J. C., “Rational Capital Budgeting in an Irrational World,” Journal of Business, 69 (1996), 429–455. Stock, J. H., and M. W. Watson, “New Indexes of Coincident and Leading Economic Indicators,” in NBER Macroeconomics Annual 1989, O. J. Blanchard and S. Fisher, eds. (Cambridge, MA: NBER, 1989). ——, “Forecasting Output and Inflation: The Role of Asset Prices,” Journal of Economic Literature, 41 (2003), 788–829. Summers, L. H., “Taxation and Corporate Investment: A Q-Theory Approach,” Brookings Papers on Economic Activity, 1 (1981), 67–127. 
Tauchen, G., “Finite State Markov-Chain Approximations to Univariate and Vector Autoregressions,” Economics Letters, 20 (1986), 177–181. Thomas, J. K., “Is Lumpy Investment Relevant for the Business Cycle?” Journal of Political Economy, 110 (2002), 508–534. Tobin, J., “A General Equilibrium Approach to Monetary Theory,” Journal of Money, Credit and Banking, 1 (1969), 15–29. Vuolteenaho, T., “What Drives Firm-Level Stock Returns?” Journal of Finance, 57 (2002), 233–264.
THE POWER OF TV: CABLE TELEVISION AND WOMEN’S STATUS IN INDIA∗

ROBERT JENSEN AND EMILY OSTER

Cable and satellite television have spread rapidly throughout the developing world. These media sources expose viewers to new information about the outside world and other ways of life, which may affect attitudes and behaviors. This paper explores the effect of the introduction of cable television on women’s status in rural India. Using a three-year, individual-level panel data set, we find that the introduction of cable television is associated with significant decreases in the reported acceptability of domestic violence toward women and son preference, as well as increases in women’s autonomy and decreases in fertility. We also find suggestive evidence that exposure to cable increases school enrollment for younger children, perhaps through increased participation of women in household decision making. We argue that the results are not driven by preexisting differential trends.
I. INTRODUCTION

The growth of television in the developing world over the past two decades has been extraordinary. Estimates suggest that the number of television sets in Asia has increased more than sixfold, from 100 million to 650 million, since the 1980s (Thomas 2003). In China, television exposure grew from 18 million people in 1977 to 1 billion by 1995 (Thomas 2003). In more recent years, satellite and cable television availability has increased dramatically. Again in China, the number of people with satellite access increased from just 270,000 in 1991 to 14 million by 2005. Further, these numbers are likely to understate the change in the number of people for whom television is available, because a single television is often watched by many.

Several studies have demonstrated that the information and exposure provided by television can influence a wide range of attitudes and behavior. Gentzkow and Shapiro (2004) find that television viewership in the Muslim world affects attitudes toward the West, and DellaVigna and Kaplan (2007) show large effects of the Fox News Channel on voting patterns in the United States. In the developing world, Olken (2006) shows that television decreases participation in social organizations in Indonesia, and Chong, Duryea, and La Ferrara (2007) find that exposure to soap operas in Brazil reduces fertility.

∗ Matthew Gentzkow, Larry Katz, Steve Levitt, John List, Divya Mathur, Ben Olken, Andrei Shleifer, and Jesse Shapiro provided helpful comments. Perwinder Singh provided excellent research assistance.
© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009
India has not been left out of the cable and satellite revolution: a recent survey finds that 112 million households in India own a television, with 61% of those homes having cable or satellite service (National Readership Studies Council 2006). This figure represents a doubling in cable access in just five years from a previous survey. The survey finds that in some states, the change has been even more dramatic; in the span of just ten to fifteen years since it first became available, cable or satellite penetration has reached an astonishing 60% in states such as Tamil Nadu, even though the average income is below the World Bank poverty line of two dollars per person per day. Beyond providing entertainment, television vastly increases both the availability of information about the outside world and exposure to other ways of life. This is especially true for remote, rural villages, where several ethnographic and anthropological studies have suggested that television is the primary channel through which households get information about life outside their village (Mankekar 1993, 1998; Fernandes 2000; Johnson 2001; Scrase 2002). Most popular cable programming features urban settings where lifestyles differ in prominent and salient ways from those in rural areas. For example, many characters on popular soap operas have more education, marry later, and have smaller families, all things rarely found in rural areas; and many female characters work outside the home, sometimes as professionals, running businesses or in other positions of authority. Anthropological accounts suggest that the growth of TV in rural areas has had large effects on a wide range of day-to-day lifestyle behaviors, including latrine building and fan usage (Johnson 2001). Yet there have been few rigorous empirical studies of the impacts that this dramatic expansion in cable access may have had on social and demographic outcomes. 
In this paper we explore the effect of the introduction of cable television in rural areas of India on a particular set of values and behaviors, namely attitudes toward and discrimination against women. Although issues of gender equality are important throughout the world, they are particularly salient in India. Sen (1992) argued that there were 41 million “missing women” in India—women and girls who died prematurely due to mistreatment—resulting in a dramatically male-biased population. The population bias toward men has only gotten worse in the past two decades, as sex-selective abortion has become more widely used to avoid female births (Jha et al. 2006). More broadly,
girls in India are discriminated against in nutrition, medical care, vaccination, and education (Basu 1989; Griffiths, Matthews, and Hinde 2002; Pande 2003; Borooah 2004; Mishra, Roy, and Retherford 2004; Oster 2009). Even within India, gender inequality is significantly worse in rural than urban areas. By exposing rural households to urban attitudes and values, cable and satellite television may lead to improvements in status for rural women. It is this possibility that we explore in this paper. The primary analysis relies on a three-year panel data set covering women in five Indian states between 2001 and 2003. These years represent a time of rapid growth in rural cable access. During the panel, cable television was newly introduced in 21 of the 180 sample villages.1 Our empirical strategy relies on comparing changes in gender attitudes and behaviors between survey rounds across villages based on whether (and when) they added cable television. Using these data, we find that cable television has large effects on women’s status. After cable is introduced to a village, there are significant changes in gender attitudes: women are less likely to report that it is acceptable for a husband to beat his wife, and less likely to express a preference for sons. Behaviors traditionally associated with women’s status also change: women report increased autonomy (e.g., the ability to go out without permission and to participate in household decision making) and lower fertility. In terms of magnitude, the effects are quite large—for example, the introduction of cable decreases the differences in attitudes and behaviors between urban and rural areas by 45% to 70%. Further, these effects happen quickly, with observable impacts in the first year following cable introduction. 
This is consistent with existing work on the effects of media exposure, which typically finds rapid changes (within a few months, in many cases) in behaviors such as contraceptive use, pregnancy, latrine building, and perception of own-village status (Pace 1993; Valente et al. 1994; Kane et al. 1998; Rogers et al. 1999; Johnson 2001).

1. Cable television in these villages is generally introduced by an entrepreneur, who purchases a satellite dish and subscription and then charges people (generally within 1 km of the dish) to run cables to it. In this sense, people are actually accessing satellite channels. We will use the terms cable and satellite interchangeably to refer to programming not available via public broadcast signals. Our interest is with the content of programming available to households, rather than the physical means of delivery of that content.

A central empirical concern is the possibility that trends in other variables (e.g., income or “modernity”) affect both cable
access and women’s status. We argue that this does not appear to be the case, first showing visually that there are no preexisting differential trends in women’s status for villages that do and do not add cable, and that the timing of changes in outcomes is closely aligned with the introduction of cable; and second, that the outcomes are not correlated with future cable access. Policy makers and academics often argue that a significant benefit of improved status for women is increased investments in children (World Bank 2001, 2006; Qian 2008). Although our ability to look at children’s outcomes is limited, we are able to look at the effects of cable access on school enrollment. Using both our household panel data and administrative data for roughly 1,000 villages in the state of Tamil Nadu, we provide evidence that the introduction of cable increases school enrollment for younger children. Although the enrollment data have some limitations relative to the data on women’s status, we see large effects of cable that also appear to increase over time. Again, we argue these results are not driven by preexisting trends in the outcome variables. The results are potentially quite important for policy. As noted, a large literature in economics, sociology, and anthropology has explored the underlying causes of discrimination against women in India, highlighting the dowry system, low levels of female education, and other socioeconomic factors as central factors (Rosenzweig and Shultz 1982; Murthi, Guio, and Dreze 1995; Agnihotri 2000; Agnihotri, Palmer-Jones, and Parikh 2002; Rahman and Rao 2004; Qian 2008). And although progress has been made in these areas, changing the underlying factors behind low levels of education, women’s status, and high fertility has proven to be very difficult; introducing television, or reducing any barriers to its spread, may be less so. 
In fact, the government of Tamil Nadu has recently begun a program to provide free color televisions to 7.5 million households with the goal of ensuring that every household has one by 2011. One of the primary objectives of this program is to enable women, particularly in rural areas, to “acquire knowledge for social and economic development.” Therefore, our results also provide insight into the potential impact that this unique and nontraditional strategy can have on critical policy priorities. From a policy perspective, however, there are potential concerns about whether the changes in reported attitudes, such as toward domestic violence or son preference, represent changes in behaviors, or just in reporting. For example, we may be concerned
that exposure to television only changes what the respondent thinks the interviewer wants to hear about the acceptability of beating, but does not actually change the incidence of beating. This is less of a concern in the case of autonomy and fertility, where women are asked about their actual behavior (and for fertility, there is less scope for misreporting because both pregnancies and recent births are likely to be observable by the interviewer). In addition, the fact that we find effects on education in administrative data provides support for an effect of cable on behavior. Without directly observing people in their homes, however, it is difficult to conclusively separate changes in reporting from changes in behavior. However, even if cable only changes what is reported, it still may represent progress: changing the perceived “correct” attitude seems like a necessary, if not sufficient, step toward changing outcomes.

The remainder of the paper is organized as follows. Section II provides background on television in India and discusses existing anthropological and ethnographic evidence on the impact of television on Indian society, as well as the determinants of cable placement. Section III describes the SARI data and empirical strategy. Section IV presents the results on women’s status, and Section V the results on education. Section VI provides some discussion of magnitudes and timing of, and mechanisms behind, the results and concludes.

II. TELEVISION IN INDIA

II.A. Background

Although television was first introduced to India in 1959, for the first three decades almost all broadcasting was in the hands of the state, and the content was primarily focused toward news or information about economic development.2 The most significant innovation in terms of both content and viewership was the introduction of satellite television in the early 1990s.
2. The background information detailed here is drawn largely from Mankekar (1999) and http://www.indiantelevision.com/indianbrodcast/history/historyoftele.htm.

In the five years from 2001 to 2006, about 30 million households, representing approximately 150 million individuals, added cable service (National Readership Studies Council 2006). And because television is often watched with family and friends by those without a television
or cable, the growth in actual access or exposure to cable is likely to have been even more dramatic. The program offerings on cable television are quite different from government programming. The most popular shows tend to be game shows and soap operas. For example, among the most popular shows in both 2000 and 2007 (based on Indian Nielsen ratings) is Kyunki Saas Bhi Kabhi Bahu Thi (Because a Mother-in-Law Was Once a Daughter-in-Law, Also), a show based around the life of a wealthy industrial family in the large city of Mumbai. As can be seen from the title, the main themes and plots of the show revolve around issues of family and gender.

The introduction of television appears in general to have had large effects on Indian society. This is particularly the case for gender, because this is an area where the lives of rural viewers differ greatly from those depicted on most popular shows. Because the most popular Indian serials take place in urban settings, women depicted on these shows are typically much more emancipated than rural women. Further, in many cases there is access to Western television, where these behaviors differ even more markedly from rural India. Based on anthropological reports, this seems to have affected attitudes within India. Scrase (2002) reports that several of his respondents thought television might lead women to question their social position and might help the cause of female advancement. Another woman reports that because of television, men and women are able to “open up a lot more” (Scrase 2002). Johnson (2001) quotes a number of respondents describing changes in gender roles as a result of television. One man notes, “Since TV has come to our village, women are doing less work than before. They only want to watch TV. So we [men] have to do more work. Many times I help my wife clean the house.”

There is also a broader literature on the effects of television exposure on social and demographic outcomes in other countries.
Many studies find effects on a variety of outcomes: for example, eating disorders in Fiji (Becker 2004), sex role stereotypes in Minnesota (Morgan and Rothschild 1983), and perceptions of women’s rights in Chicago (Holbert, Shah, and Kwak 2003). Telenovelas in Brazil have provided a fruitful context for studying the effects of television. For example, on the basis of ethnographic research, La Pastina (2004) argues that exposure to telenovelas provides women (in particular) with alternative models of what role they might play in society. Pace (1993) describes the effect of television introduction in Brazil on a small, isolated, Amazon community,
arguing that the introduction of television changed the framework of social interactions, increased general world knowledge, and changed people’s perceptions about the status of their village in the wider world. Kottak (1990) reports on similar data from isolated areas in Brazil and argues that the introduction of television affects (among other things) views on gender, moving individuals in these areas toward having more liberal views on the role of women in both the workplace and relationships. And closely related to one of our outcomes, Chong, Duryea, and La Ferrara (2007) report declines in fertility in Brazil in response to access to telenovelas; they also find changes in naming patterns of children, with the names of main characters featured on these programs increasing in popularity.

Interestingly, the ethnographic and anthropological studies in Brazil also suggest that the patterns of viewing shortly after television is first introduced may be quite different from what is seen later on. The evidence suggests that in the first years after introduction, interactions with the television are more intense, with the television drawing more focus (both at an individual level and community-wide). It is during this early period, as argued by Kottak (1990) and others, that television is at its most influential. Most of the villages in our analysis are at this early stage of television exposure, suggesting this may be an ideal period to look for effects.

Except for Chong, Duryea, and La Ferrara (2007), much of the evidence described above is drawn from interviews and case studies, and obviously does not reflect a random sample of these populations. Nevertheless, the overall impression given by the anthropological and sociological literature is that the introduction of television has widespread effects on society, and that gender and social issues are particular focal points. Our data and setting provide an opportunity to test this hypothesis more rigorously.

II.B. Placement and Timing of Cable Access

In moving to a more quantitative analysis of the impacts of cable, we must recognize that variation in access is certainly nonrandom. Therefore, understanding the determinants of the timing and placement of cable is important for our ability to attribute changes in women’s status to the introduction of cable itself. To determine what drives the introduction of cable, we first conducted interviews with cable operators in Tamil Nadu. In these interviews, the operators emphasized two primary considerations:
access to electricity and distance to the nearest town or city. Electricity is, of course, a fundamental requirement for television. Distance is important because most operators who provide service to rural villages reside in towns or cities. Greater distances (i.e., more remote villages) increase the operator’s costs, because they often must personally travel to the village to monitor the cable setup (to ensure that it is working properly and that no unauthorized users are connecting to it), collect payments, make repairs or update equipment, or add new subscribers.3 For the remotest villages, a single trip could require an entire day. As a result, villages closest to larger towns were served first, with more distant villages being covered only after the more profitable villages were taken. Income was less often mentioned by operators as a constraint, because charges for cable access are small (about US$1–$2 per month); in separate interviews with companies marketing televisions, however, this was more of a concern. Overall, most cable operators reported that variation in access was driven largely by costs, and changes in costs, on the part of the providers themselves, rather than being demand driven. In fact, several stated that they believed demand was universal (at least in Tamil Nadu), and the only constraint on provision was the operator’s costs.

In addition to these interviews, we conducted a survey of cable operators in Tamil Nadu, gathering information on cable access for over 1,000 rural villages in Tamil Nadu (these data are described in more detail in Section V). For 220 of the villages in our survey that do not have cable as of 2008, surveyors recorded the reason(s) given for lack of access. For a majority of the villages (62%), the main reason was that the village was too far away; the other major reason (30%) was that the village was too small to support cable.
The cable operator data can be used to examine the determinants of cable access more quantitatively by merging villages with administrative data from an education database (again, described in more detail in Section V). Doing so allows us to examine the village-level relationship between cable access and the correlates suggested above: distance to a town, population, and electrification.4 Panel A of Table I shows bivariate correlations

3. Outside of major cities, many of these operators are fairly small businesses, serving from a handful to a dozen or so villages, and so much of this work is done by the individual entrepreneurs themselves.

4. Electricity in the village is inferred from information about electricity in the schools and population is only for children ages 6–14.
TABLE I
CORRELATES OF CABLE PLACEMENT

A. Bivariate correlations
                                        (1)                 (2)                 (3)
First variable:                         Have cable 2008     Year cable          Have cable
                                                            introduction        in 2003
Sample:                                 Tamil Nadu          Tamil Nadu          SARI
Second variable
  Electricity (0/1)                     .2871*** [1,047]    −.1700*** [690]     .4183*** [180]
  Log dist. to nearest town             −.1978*** [1,040]   .1347** [685]       −.3887*** [180]
  Village pop. (1000s)                  .2310*** [1,040]    −.2028*** [685]
  Pop. density (1000s)                                                          .1610* [136]
  Ave. log HH income PC                                                         .1749** [180]
  Ave. education                                                                .4628*** [180]

B. Regression analysis of cable placement
                                        (1)                 (2)                 (3)                 (4)
Dependent variable:                     Have cable 2008     Year cable          Have cable          Have cable
                                                            introduction        in 2003             in 2003
Sample:                                 Tamil Nadu          Tamil Nadu          SARI                SARI
Explanatory variables
  Electricity (0/1)                     .2301*** (.029)     −1.1834*** (.353)   .276** (.109)       .122 (.139)
  Log dist. to nearest town             −.1111*** (.021)    .6463*** (.233)     −.076 (.050)        −.086* (.045)
  Village pop., age 6–14 (in ’000s)     .1808*** (.036)     −1.4351*** (.35)
  Pop. density (in ’000s)                                                       .590* (.313)        .245 (.302)
  Ave. log HH income PC                                                         −.015 (.049)        .073** (.047)
  Ave. education                                                                .074*** (.021)      .033 (.022)
State FE                                N/A                 N/A                 NO                  YES
Number of observations                  1,039               670                 136                 136
R2                                      .13                 .07                 .26                 .43
Notes. Panel A shows the village-level bivariate correlation between each variable and the corresponding cable access variable. Numbers of observations are in square brackets. Panel B shows the village-level determinants of having cable in regression form. Columns (1) and (2) use a sample from Tamil Nadu; columns (3) and (4) use the SARI sample, from five states. Standard errors (in parentheses) and significance level are reported in regressions; significance levels are indicated in the correlations. *Significant at 10%. **Significant at 5%. ***Significant at 1%.
between these variables and having cable in 2008 (column (1)) and year of cable introduction among villages that have cable in 2008 (column (2)). Consistent with the reports from cable operators, villages that are farther from a town get cable later and are overall less likely to have cable in 2008, whereas having electricity
or a larger population has the opposite effects. We see similar patterns in the multivariate regressions in Panel B. We can also use information from our other data source, the Survey of Aging in Rural India (SARI), to do a similar analysis of the village-level determinants of having cable at the time of the final survey in 2003; the sample size is smaller, but the survey has more variables and was conducted in five states. Column (3) in Panel A of Table I reports bivariate correlations between cable access and electricity, distance to the nearest town,5 population density, and average income and education. The results are similar to those from the Tamil Nadu data. Panel B shows multivariate regressions with these variables and, in column (4), includes state fixed effects to capture the fact that access varies significantly across states. The broad patterns remain the same. In these data, there is some evidence of a role for income as a determinant of cable access, suggesting it may be important to control for it even though it was not explicitly mentioned by cable operators. These quantitative results are supportive of the qualitative evidence from the interviews and suggest the importance of controlling for these determinants of cable access in our analysis. Under the assumption that these variables constitute the primary determinants of access, controlling for them should allow us to more convincingly attribute the changes in the outcomes to the introduction of cable. Note, however, that the R2 values in the regressions in Table I are small, indicating that much of the variation in cable access remains unexplained. 
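To make the bivariate correlations in Panel A and the remark about small R2 values more concrete, the following Python sketch computes a Pearson correlation between a cable indicator and log distance to town. The eight villages and their values are invented purely for illustration and are not taken from the Tamil Nadu or SARI data; the function and variable names are likewise hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy, invented values for eight hypothetical villages (NOT data from the paper).
has_cable = [1, 1, 1, 0, 1, 0, 0, 0]                    # 1 = village has cable
log_dist = [0.5, 1.0, 1.2, 1.5, 1.8, 2.2, 2.5, 3.0]     # log distance to nearest town

r = pearson(has_cable, log_dist)

# In an OLS regression with an intercept and a single regressor,
# R2 equals the squared correlation coefficient.
r_squared = r ** 2

print(f"correlation = {r:.3f}, one-regressor R2 = {r_squared:.3f}")
```

Because R2 in a one-regressor regression equals the squared correlation, a correlation of roughly 0.2 in absolute value would on its own account for only about 4% of the variation in the outcome, which is one way to read the modest R2 values reported in Panel B.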
One possibility, of course, is that other than these important variables, entrepreneurs choose where and when to introduce cable somewhat arbitrarily, or at least based on factors that are unlikely to have an independent effect on women’s status.6 But we certainly cannot rule out that there is some important variable that drives cable introduction that was not mentioned by cable operators and that also has an impact on our outcomes of interest.

5. The SARI survey did not define “town” for the purposes of measuring distance, and so there is likely to be some variation across villages in what this variable measures.

6. And indeed, this possibility is consistent with some of the reports of cable operators in our interviews. For example, some operators said they had chosen particular villages because they were the home villages of the person who cleaned their office or the woman in the market they bought vegetables from.

Given this, it is important to look directly at whether the introduction of cable is predictable from the levels of, or changes in,
our dependent variables. Online Appendix Table W.1 shows that, conditional on the simple controls used above, adding cable during the sample period does not appear to be systematically related to initial levels of our measures of women’s status (discussed in more detail in the next section).7 In most cases, the coefficient on adding cable is close to 0. In fact, villages that later add cable are initially slightly more inclined to report that domestic violence is acceptable and have higher initial levels of fertility. In addition to these results, we will show evidence later that getting cable in future years is not predictive of changes in outcomes. However, what we cannot rule out with our data is that there is some important unobservable that was not mentioned by cable operators that simultaneously drives year-to-year cable introduction and year-to-year variation in our outcome measures; although this seems unlikely, and we are unable to think of plausible examples, it is important to keep this caveat in mind.

III. DATA AND EMPIRICAL STRATEGY

III.A. Data: Survey of Aging in Rural India

Our primary data set is SARI, a panel survey of 2,700 households, each containing a person age fifty or older, conducted in 2001, 2002, and 2003 in four states (Bihar, Goa, Haryana, and Tamil Nadu) and the capital, Delhi. The sample was selected in two stages: in the first stage, 180 villages were selected at random from district lists (40 villages in Bihar, Haryana, and Tamil Nadu; 35 in Delhi; and 25 in Goa), and in the second stage, 15 households were chosen within each village through random sampling based on registration lists. Other than Delhi, the survey was confined to rural areas. Attrition over the panel was low, with just 108 (4%) of the original households dropping out by the third round. All women in the sample households ages fifteen and older were interviewed (no men were interviewed).
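The sampling arithmetic in this description can be checked directly. The short Python sketch below recomputes the number of villages, the number of households, and the attrition rate from the figures reported in the text, reading "40 villages in Bihar, Haryana, and Tamil Nadu" as 40 in each of the three states. The variable and function names are illustrative only and do not come from the SARI documentation.

```python
# Village counts by state as reported in the text (40 in each of three states).
villages_by_state = {
    "Bihar": 40,
    "Haryana": 40,
    "Tamil Nadu": 40,
    "Delhi": 35,
    "Goa": 25,
}
HOUSEHOLDS_PER_VILLAGE = 15   # second-stage sample size per village
ATTRITED_HOUSEHOLDS = 108     # households lost by the third round


def sample_summary(villages, households_per_village, attrited):
    """Return (total villages, total households, attrition rate)."""
    total_villages = sum(villages.values())
    total_households = total_villages * households_per_village
    attrition_rate = attrited / total_households
    return total_villages, total_households, attrition_rate


total_villages, total_households, attrition_rate = sample_summary(
    villages_by_state, HOUSEHOLDS_PER_VILLAGE, ATTRITED_HOUSEHOLDS
)

print(total_villages, total_households, round(attrition_rate, 3))
# prints: 180 2700 0.04
```

Under that reading, the state-level village counts sum to the 180 villages cited, 180 villages of 15 households give the 2,700 households, and 108 attrited households correspond to the reported 4%.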
7. Online Appendix Table W.1 appears in Online Appendix W, which is accessible on the authors’ websites as well as on the Journal’s website. All Appendix tables can be found in this Supplemental Appendix.

Several sections of this survey were modeled to be compatible with other demographic surveys for India, such as the National Family and Health Survey. The survey collected information on a range of (current and past) demographic, social, and economic variables. In addition, a village-level survey with local government officials
gathered information on economic and social conditions and infrastructure. Basic summary statistics on the women included in the SARI sample are provided in Panel A of Table II. Women in the sample have relatively little education (an average of 3.5 years), are predominantly Hindu, and are quite poor, with an average per capita income of around US$35 per month. The SARI data contain a number of measures of women’s status. We begin with two attitude measures: son preference and the acceptability of domestic violence. For the former, women who reported wanting to have more children were asked: “Would you like your next child to be a boy, a girl, or it doesn’t matter?” Son preference is defined as wanting the next child to be a boy. For domestic violence, women were asked: “Please tell me if you think that a husband is justified in beating his wife in each of the following situations: If he suspects her of being unfaithful; if her natal family does not give expected money, jewelry, or other things; if she shows disrespect for him; if she leaves the home without telling him; if she neglects the children; if she doesn’t cook food properly.” The outcome measure we use is the number of situations in which the woman reports that beating is acceptable (0–6). Summary statistics for these two measures are in Panel B of Table II. Over 60% of women feel that it is acceptable for a husband to beat his wife under at least one of the six situations listed. On average, women report 1.6 situations in which it is considered acceptable. Women are most likely to believe beating is acceptable if a wife neglects her children, goes out without permission, or does not show respect toward her husband. Perhaps surprisingly, being unfaithful is reported as valid justification for violence by slightly fewer women. 
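As an illustration of how the two attitude measures described above could be coded from raw responses, the following Python sketch builds the 0–6 count of situations in which beating is reported as acceptable, together with the son-preference indicator. The field names and response codes are hypothetical; the actual coding used in the SARI questionnaire is not given in the text.

```python
from typing import Mapping, Optional

# Hypothetical keys for the six situations listed in the survey question.
SITUATIONS = (
    "unfaithful",
    "natal_family_gives_no_money",
    "shows_disrespect",
    "leaves_without_telling",
    "neglects_children",
    "cooks_badly",
)


def beating_acceptable_count(responses: Mapping[str, bool]) -> int:
    """Number of situations (0-6) in which the respondent says beating is justified."""
    return sum(1 for situation in SITUATIONS if responses.get(situation, False))


def son_preference(wants_more_children: bool, preferred_sex: str) -> Optional[int]:
    """1 if the next child should be a boy, 0 otherwise.

    Returns None when the respondent does not want more children,
    because the question is asked only of women who do.
    """
    if not wants_more_children:
        return None
    return 1 if preferred_sex == "boy" else 0


example = {"shows_disrespect": True, "neglects_children": True}
print(beating_acceptable_count(example))        # 2
print(son_preference(True, "doesn't matter"))   # 0
print(son_preference(False, "boy"))             # None
```

Returning None for women who do not want more children mirrors the fact that the son-preference question was put only to women who reported wanting another child, so those respondents fall outside the sample for that variable rather than being counted as zeros.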
In terms of son preference, 55% of women who want another child prefer that child to be a boy.8 The residual is not simply preferring a girl; only about 13% of women want their next child to be a girl, with the remainder reporting that the sex of the child doesn’t matter (about one-quarter of the sample) or reporting something else (such as “up to God”).

8. Note that the sample size for this variable is smaller because the question is only asked to women who want more children.

In addition, we analyze the effects on two behaviors associated with women’s status: household decision making (autonomy) and fertility. To measure autonomy, women were asked (separately for each of the following activities), “Who makes the following decisions in your household: Obtaining health care for
TABLE II
SUMMARY STATISTICS

                                                        Mean      Standard deviation    # Obs
A. Demographic variables on SARI respondents
  Women’s years of education                            3.6       4.5                   9,159
  Women’s age                                           31.7      8.7                   9,159
  Household income per capita (Rs)                      1,405     5,508                 9,159
  Hindu (0/1)                                           0.859     0.348                 9,159
B. Women’s status data from SARI
  Want next child to be a boy                           0.549     0.498                 2,165
  Beating ever acceptable                               0.621     0.485                 9,159
  No. of situations beating is acceptable               1.62      1.74                  9,159
  Husband may hit if:
    Unfaithful                                          0.244     0.430                 9,159
    Family doesn’t give money                           0.204     0.403                 9,159
    Show disrespect                                     0.309     0.462                 9,159
    Go out without telling                              0.307     0.461                 9,159
    Neglect children                                    0.337     0.473                 9,159
    Bad cook                                            0.216     0.412                 9,159
  Autonomy: Make decision about:
    Health care?                                        0.567     0.496                 9,159
    Purchases?                                          0.556     0.497                 9,159
    Visits to family/friends?                           0.549     0.498                 9,159
  Autonomy: Need permission to:
    Go to market?                                       0.562     0.538                 9,159
    Visit family/friends?                               0.676     0.497                 9,159
  Resp. keeps own money                                 0.740     0.439                 9,159
  Average # of children, 2001                           2.4       1.9                   3,053
  Pregnant this year                                    0.070     0.255                 8,028
C. Data on education (SARI and DISE)
  Data from SARI (village level)
    Enrollment rate (fixed cohort ages 6–7 in 2001)     0.761     0.291                 411
    Enrollment rate (ages 6–10)                         0.803     0.210                 509
    Enrollment rate (ages 11–14)                        0.696     0.233                 495
  Data from DISE
    Total enrollment (fixed cohort ages 6–7 in 2002)    90        139                   4,229
    Total enrollment (ages 6–10)                        209       327                   5,163
    Total enrollment (ages 11–14)                       114       217                   3,485
Notes. This table shows summary statistics for the variables used in the paper. Panels A and B use data from the SARI survey for women ages 15 and over; the unit of observation is a woman-year. The data cover years 2001–2003. Son preference questions are asked only to the subset of women who report wanting more children. In the education data in Panel C (for both SARI and the DISE) the unit of observation is a village-year, and the years covered are 2001–2003 (SARI) and 2002–2007 (DISE).
yourself; purchasing major household items; whether you visit or stay with family members or friends?” The possible responses were: “1. Respondent; 2. Husband; 3. Respondent jointly with husband; 4. Other household members; 5. Respondent jointly with other household members.” Women were also asked whether they need permission from their husbands to visit the market (one question) or to visit friends or relatives (a second question). Responses were coded on a scale of 1 to 3 (do not need permission, need permission, not permitted at all). Women were also asked whether they are allowed to keep money set aside to spend as they wish. Finally, fertility is measured by asking female respondents if they are currently pregnant. Later, we also use birth histories to construct earlier trends in fertility. Panel B of Table II reports summary statistics for these measures. For the decision-making variables, we condense the responses to binary indicators for whether the woman participates in the decision (either decides on her own or decides jointly with others in the household). Overall, slightly more than half of women participate in each of the decisions. There is some overlap in these variables, though not as much as might be suggested by the similarity of their means; about 20% of women do not participate in any of the decisions, 25% participate in one, 27% in two, and 29% in three. About one-half of women report needing permission to go to the market and two-thirds need permission to visit family or friends.9 By contrast, nearly three-quarters of women are allowed to keep money set aside to spend as they wish. However, by most measures, women’s autonomy overall is quite low. For our empirical analysis, we redefine the variables so that higher values always indicate greater autonomy, and we average the six variables to generate a single measure ranging from 0 to 1. 
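The construction of the autonomy index described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the argument layout, and in particular the linear rescaling of the 1 to 3 permission scale are assumptions, since the text only states that the six variables are redefined so that higher values indicate greater autonomy and then averaged.

```python
# Response codes that count as participating in a decision:
# 1 = respondent, 3 = jointly with husband, 5 = jointly with other members.
PARTICIPATES = {1, 3, 5}


def autonomy_index(decision_codes, permission_codes, keeps_money):
    """Average six autonomy components, each scaled so that 1 is most autonomous.

    decision_codes: three responses (1-5) for health care, purchases, visits.
    permission_codes: two responses (1-3) for the market and for family/friends,
        where 1 = no permission needed, 2 = needs permission, 3 = not permitted.
    keeps_money: True if the woman may keep money to spend as she wishes.
    """
    decisions = [1.0 if code in PARTICIPATES else 0.0 for code in decision_codes]
    # ASSUMPTION: the 1-3 permission scale is mapped linearly onto [0, 1].
    permissions = [(3 - code) / 2 for code in permission_codes]
    components = decisions + permissions + [1.0 if keeps_money else 0.0]
    return sum(components) / len(components)
```

For example, a woman who decides on health care alone, has no say in purchases, decides on visits jointly with her husband, needs no permission for the market but needs it for visits, and keeps her own money would score 0.75 under these assumptions.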
Data on cable access in SARI are based on information collected in a village questionnaire that gathered information on a variety of services and infrastructure. Thus cable is measured at the village level, not the individual level. Panel A of Table III provides information on cable access throughout the survey period, which, again, was a time of rapid expansion in access. In our data, 90 of the 180 villages have cable in the first round, and an additional 11 villages added cable by the 2002 survey and another 10 added it by 2003.

9. Although not shown here, about 4% are not permitted to do each of these things.

Finally, 69 villages never get cable during this period (no
TABLE III
SUMMARY STATISTICS ON CABLE AVAILABILITY

A. Cable availability by survey round, SARI data
Year                     Number of villages with cable
2001                     90
2002                     101
2003                     111
Not during survey        69

B. Cable availability by state, SARI data
State          Share of villages with cable, 2001 (%)    Number that add cable
Bihar          7.5                                       5
Delhi          97.1                                      0
Goa            60                                        4
Haryana        17.5                                      6
Tamil Nadu     77.5                                      6

C. Year of cable access, Tamil Nadu
Year                     Number of villages with new access
1989                     1
1990                     5
1991                     5
1992                     3
1993                     26
1994                     44
1995                     19
1996                     21
1997                     29
1998                     93
1999                     34
2000                     67
2001                     47
2002                     43
2003                     47
2004                     47
2005                     79
2006                     33
2007                     28
2008 (Jan., Feb.)        5
Notes. This table shows summary statistics on cable availability. Panels A and B focus on the SARI data, showing access either over time (Panel A) or across state in the first sample year (Panel B). Panel C shows the timing of access in the Tamil Nadu sample.
villages dropped cable). The identification of the effects of cable in this paper relies on the 21 villages that added it in either 2002 or 2003. There is significant regional variation in access to cable in the SARI data. Panel B of Table III shows the percent of sample villages in each state with access in 2001. Not surprisingly, the capital, Delhi, has essentially universal access. Elsewhere, the two southern states of Tamil Nadu and Goa have very high cable penetration, at 78% and 60% of villages, respectively. By contrast, coverage in the two northern states of Bihar and Haryana is low (7% and 17%). Although this variation may seem extreme, or perhaps an artifact of our particular sample of rural villages, these estimates are consistent with a 2001 national census of villages (NSSO 2003).

III.B. Empirical Strategy

Our basic empirical strategy is to compare changes in our measures of women’s status for villages that add cable over the course of the panel relative to those that do not. We run individual-level fixed-effects regressions of each outcome on cable availability (measured at the village level). Denote the outcome for individual $i$ in village $v$ in year $t$ as $s_{ivt}$ and the measure of cable access as $c_{vt}$. The primary regression estimated is

$$s_{ivt} = \beta c_{vt} + \gamma_{iv} + \delta_t + \tau X_{ivt} + \varepsilon_{ivt}, \qquad (1)$$

where $\gamma_{iv}$ is a full set of individual fixed effects, $\delta_t$ is a full set of year dummies, and the other controls, $X_{ivt}$, include household income and a quadratic in age. Our identifying assumption is that villages that added cable would not otherwise have changed differently than those villages that did not add cable. We discuss this in more detail below. Although any fixed village characteristics that determine cable access will be absorbed by the fixed effects, in order to account for any possible differential trends in the outcomes by these factors, we also include in the regressions interactions between a year indicator and state dummies, income, education, age and age-squared, village population density, electrification status, and distance to nearest town. Data for all these variables except income were collected only at baseline, and thus we are only controlling for differential trends based on initial values of these variables, not for actual trends in these variables. Standard errors are adjusted for clustering at the village level.
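To make the estimator concrete, the sketch below computes the coefficient on cable from a regression with individual and year fixed effects, together with a village-clustered standard error. It is a simplified illustration under stated assumptions, not the authors' implementation: it assumes a balanced panel, omits the additional controls and trend interactions, applies no small-sample correction, and the function name and row layout are hypothetical.

```python
from collections import defaultdict


def two_way_fe(rows):
    """Two-way fixed-effects slope with village-clustered standard error.

    rows: list of (person_id, village_id, year, cable, outcome) tuples forming
    a balanced panel. Returns (beta, clustered_se).
    """
    def means(key_idx, val_idx):
        total, count = defaultdict(float), defaultdict(int)
        for r in rows:
            total[r[key_idx]] += r[val_idx]
            count[r[key_idx]] += 1
        return {k: total[k] / count[k] for k in total}

    n = len(rows)
    demeaned = {}
    for val_idx in (3, 4):  # 3 = cable, 4 = outcome
        by_person = means(0, val_idx)
        by_year = means(2, val_idx)
        grand = sum(r[val_idx] for r in rows) / n
        # In a balanced panel, this double demeaning removes person and year effects.
        demeaned[val_idx] = [
            r[val_idx] - by_person[r[0]] - by_year[r[2]] + grand for r in rows
        ]
    x, y = demeaned[3], demeaned[4]
    sxx = sum(a * a for a in x)
    beta = sum(a * b for a, b in zip(x, y)) / sxx

    # Sum the score x * residual within each village, then form the sandwich variance.
    score = defaultdict(float)
    for r, a, b in zip(rows, x, y):
        score[r[1]] += a * (b - beta * a)
    variance = sum(s * s for s in score.values()) / sxx ** 2
    return beta, variance ** 0.5
```

Identification here comes only from villages whose cable status changes over the panel, which matches the reliance on the 21 adopting villages noted above.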
FIGURE I Cable Access and Television Viewership This figure shows the average share of people who report watching television at least once in the last week in the SARI data, broken down by villages that always have had cable, those that got cable for the first time in 2002, those that got it for the first time in 2003, and those that never have had cable.
IV. RESULTS: CABLE AND WOMEN’S STATUS

Before turning to the effect of cable on outcomes, it is worth briefly exploring the effect of cable access on TV watching. Because all villages have long had access to broadcast television, it is ex ante unclear whether the introduction of cable will change the amount of television watched, which is potentially important as a “first stage” (although cable is likely to change the content of TV watched, even if it does not change the amount, which could also have an effect). We can use the SARI data to provide some information on this question. Figure I shows the percent of women who report they watch television at least once a week (unfortunately, the survey did not gather data on the amount of time spent watching). The graph shows viewership for each year for women in four groups of villages: those that already have cable as of 2001, those that add cable in 2002, those that add cable in 2003, and those that never get cable during our survey. Overall, there is relatively little change in watching over time in either areas that never have had cable or those that always have had it. However, in villages that get cable in 2002, the share of respondents who
FIGURE II Cable Access and Attitudes toward Beating This figure shows attitudes toward beating in SARI (total situations in which beating is reported acceptable), broken down by villages that always have had cable, those that got cable for the first time in 2002, those that got it for the first time in 2003, and those that never have had cable.
report watching television at least once a week jumps from 40% to 80% between 2001 and 2002; in villages that get cable in 2003, this share is constant between 2001 and 2002, and then increases sharply from 50% to 90% between 2002 and 2003. This graph suggests a strong connection between cable availability and television viewership, with a near doubling in both cases. Having established that cable increases TV watching, we turn to the effects on women’s status. Figures II–V mimic the format of Figure I and show the effects of cable access on attitudes toward spousal abuse and son preference, autonomy, and fertility. We begin with attitudes because these are, perhaps, the most likely place to see an effect of cable. Changing behaviors (autonomy, for example) may require coordination with other family members and larger scale changes in lifestyle. Changing attitudes, which may or may not be accompanied by changes in behavior, is the first and most obvious place to look for effects of exposure to other lifestyles and values. Figure II focuses on attitudes toward beating. The number of situations in which beating is reported to be acceptable by women is relatively unchanged in the villages that don’t change cable
FIGURE III Cable Access and Son Preference This figure shows son preference (equal to 1 if the respondent reports wanting a son for the next child) from the SARI data, broken down by villages that always have had cable, those that got cable for the first time in 2002, those that got it for the first time in 2003, and those that never have had cable. The sample is only individuals who report wanting more children.
FIGURE IV Cable Access and Female Autonomy This figure shows the average of the six measures of autonomy (overall scale from 0 to 1), broken down by villages that always have had cable, those that got cable for the first time in 2002, those that got it for the first time in 2003, and those that never have had cable. Data are from the SARI survey.
FIGURE V Cable Access and Pregnancy This figure shows current pregnancy reported in the SARI data, broken down by villages that always have had cable, those that got cable for the first time in 2002, those that got it for the first time in 2003, and those that never have had cable.
status (either those who always have had cable or never get it). However, there is a large decrease in the average number of situations in which it is considered acceptable between 2001 and 2002 for those that get cable in 2002 and a (somewhat smaller) decrease between 2002 and 2003 for those that get cable in 2003.10 Figure III shows the same pattern for son preference: reported desire for the next child to be a son is relatively unchanged in areas with no change in cable status, but it decreases sharply between 2001 and 2002 for villages that get cable in 2002, and between 2002 and 2003 (but notably not between 2001 and 2002) for those that get cable in 2003.

10. The levels in this graph may appear to contradict the finding that cable reduces the acceptability of beating, because villages that always have had cable report on average much higher acceptability than those that never get it. However, this effect is driven largely by villages in Tamil Nadu, which have very high average reported acceptability of beating and high cable access. Adding state fixed effects to a regression of acceptability of beating on cable access eliminates, and in fact reverses, the apparent contradiction. Note that because our empirical analysis focuses on changes rather than levels, this effect will similarly be eliminated. It is worth noting, also, that this issue does not arise with the other outcomes, where the levels are more consistent with the changes. The “outlier” status of Tamil Nadu, with very high levels of acceptability of beating but higher levels of women’s status in other dimensions, is consistent with patterns observed elsewhere (International Institute for Population Sciences and ORC Macro 2000).

For both measures of attitudes,
the changes are large and striking, and correspond closely to the timing of introduction of cable. Next, we turn to measures that are reflective of behaviors, namely, women’s reports of their actual autonomy and fertility. Figure IV shows the effects of cable on female autonomy. Again, as in Figures II and III, we see no change in autonomy for areas with no changes in cable but large increases in autonomy in villages that add cable, and these changes again coincide closely with the timing of cable introduction. Finally, Figure V shows effects on fertility. Although the fertility data are noisier because the sample sizes are smaller, we see roughly the same pattern. There is no particular trend for areas that always have had or never get cable, but for areas that get cable in 2002 there is a decrease in fertility between 2001 and 2002, and for those that get cable in 2003 there is a decrease between 2002 and 2003. Panel A of Table IV turns to results from fixed-effect regressions of the form in equation (1).11 Columns (1) and (2) show the attitude results, which are consistent with the previous graphs. Adding cable is associated with a 12-percentage-point decrease in the reported preference to have the next child be a boy, and a 0.16 decrease in the number of situations in which it is considered acceptable for a man to beat his wife (relative to a base of 1.61). Column (3) shows the autonomy effects; again, consistent with Figure IV, the effect of cable is positive and statistically significant, improving the autonomy index by 0.026, from a base of about 0.65. This table also reports means and standard deviations of the outcome variables, to give a sense of the magnitude of the results. Column (4) shows the effect of cable on reported pregnancy during the sample period. Getting cable leads to approximately a 3.7-percentage-point decrease in the likelihood of pregnancy. This effect is extremely large. 
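As a rough gauge of magnitude, each Panel A coefficient in Table IV can be set against the reported mean of its dependent variable. The sketch below does only that arithmetic, using the figures printed in the table; the dictionary and function names are illustrative.

```python
# Panel A of Table IV: (coefficient on "village has cable", mean of dependent variable).
TABLE_IV_PANEL_A = {
    "beating situations": (-0.1608, 1.70),
    "son preference": (-0.0882, 0.57),
    "autonomy index": (0.0260, 0.64),
    "pregnant at survey": (-0.0379, 0.072),
}


def relative_effect(coefficient, mean):
    """Coefficient expressed as a fraction of the sample mean of the outcome."""
    return coefficient / mean


for outcome, (coef, mean) in TABLE_IV_PANEL_A.items():
    print(f"{outcome}: {100 * relative_effect(coef, mean):+.1f}% of the mean")
```

On these figures the pregnancy coefficient is roughly half of the mean pregnancy rate, whereas the other three effects lie between about 4% and 16% of their respective means.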
However, with our data, we are unable to determine whether the reduction in current pregnancies reflects a decline in the total number of births a woman will have, or simply increased spacing of births (though even the latter would reflect gains for women and, potentially, their children).

11. Online Appendix Table W.2 shows these regressions without the individual fixed effects for the interested reader, in specifications both excluding and including village fixed effects. The regressions with village fixed effects look very similar to those with individual fixed effects; the regressions without either set of fixed effects look quite different.

As mentioned above, the most significant issue facing the basic difference-in-difference results presented above is the
TABLE IV
EFFECT OF CABLE TELEVISION ON WOMEN’S STATUS, SARI DATA

Dependent variable:       Beating      Son          Autonomy     Pregnant at survey time
                          attitudes    preference                2001–2003    1997–2003
Explanatory variable      (1)          (2)          (3)          (4)          (5)

A. Baseline effects of cable
Village has cable         −.1608∗∗     −.0882∗∗     .0260∗∗∗     −.0379∗∗∗    −.0678∗∗
                          (.073)       (.040)       (.006)       (.013)       (.028)
Dep. var. mean            1.70         0.57         0.64         0.072        0.13
  (SD)                    (1.75)       (0.49)       (0.21)       (0.26)       (0.35)
Number of observations    7,014        1,699        7,014        7,014        11,488
R2                        .01          .01          .01          .01          .01

B. Effects of future cable
Village has cable         −.1516∗∗     −.0881∗∗     .0248∗∗∗     −.0414∗∗∗    −.0762∗∗
                          (.076)       (.039)       (.006)       (.013)       (.031)
Cable next year           .0440        .0004        −.0053       −.016        −.0253
                          (.049)       (.016)       (.004)       (.011)       (.024)
Number of observations    7,014        1,699        7,014        6,959        11,488
R2                        .01          .01          .01          .01          .01
Notes. This table shows the impact of cable TV access on attitudes toward spousal beating (column (1)), son preference (column (2)), female autonomy (column (3)), and fertility (columns (4) and (5)). Columns (1)–(4) include only the survey years, and the dependent variable is reported for attitudes, autonomy, or pregnancy. Column (5) includes 1997–2003, with pregnancy data constructed from the birth history data and excluding women in villages that have cable in 2001 because we cannot identify when they received it. Panel A includes only a measure of whether the village has cable this year. Panel B also includes a control for whether the village gets cable next year, to test for pretrends. Controls in columns (1)–(4) include individual fixed effects, year fixed effects, age, age-squared, income this year, and a linear control for year interacted with each of the following: age, age-squared, education, income this year, electricity, distance to nearest town, village population density, and state dummies. Controls in column (5) include individual fixed effects, year fixed effects, age, age-squared, and a linear control for year interacted with age and age-squared. Standard errors are in parentheses, clustered by village. ∗ Significant at 10%. ∗∗ Significant at 5%. ∗∗∗ Significant at 1%.
possibility that some unobserved variable—for example, attitudes toward “modernity”—is driving both the introduction of cable and changes in women’s status. Because variables such as this are likely to change gradually over time rather than suddenly or all at once, we might expect these effects to be evident in the form of preexisting trends in the data. That is, we would see changes in the outcomes of interest anticipating changes in cable access, because an outside variable would be driving both. We can look for evidence of this possibility first in Figures II–V. In particular, although we have a limited panel, and therefore limited scope for this test, we can look at whether villages that got cable in 2003
show evidence of changes in women’s status between 2001 and 2002. In no case do we see evidence for this—women’s status in these villages is largely unchanged between 2001 and 2002 across all measures, and then changes sharply between 2002 and 2003. The close correspondence between the introduction of cable and sudden changes in the measures of women’s status suggests a direct link between the two, rather than preexisting trends in the outcomes. We can test statistically for the possibility of pretrends more formally by including in the regressions above an indicator for getting cable next year. This coefficient is identified from the changes between 2001 and 2002 for the ten villages that add cable in 2003, plus changes between 2002 and 2003 for four villages that were recorded to have gotten cable within three months of the 2003 survey.12 The results in columns (1)–(4) of Panel B of Table IV indicate that the effect of getting cable this year is largely unchanged, compared to Panel A, and more importantly that getting cable one year later does not predict changes in women’s status. In other words, it does not seem that there are changes in the outcomes that anticipate getting cable. Note that not only are these coefficients not statistically significant, but the point estimates are extremely small, and in most cases statistically significantly different from the effect of getting cable this year—so we can reject that any changes observed after cable is introduced are simply the continuation of preexisting trends. In addition to this evidence, for fertility (only) we can use birth histories to examine for preexisting trends over a longer period. Women in the survey were asked to report on each of the children to which they had given birth. We can use these data to construct measures of whether they were pregnant in previous years, during the same survey month. 
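The construction of presurvey pregnancy indicators from birth histories can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: a woman is counted as pregnant in a reference month if she reports a birth in that month or in any of the following nine months, the treatment of the endpoints as inclusive is an assumption, and the function names and the (year, month) representation are hypothetical.

```python
def month_index(year, month):
    """Number of months since year 0, so that month arithmetic is simple."""
    return 12 * year + (month - 1)


def was_pregnant(birth_dates, ref_year, ref_month, window_months=9):
    """True if any reported birth falls 0 to window_months after the reference month.

    birth_dates: iterable of (year, month) pairs from the birth history.
    ASSUMPTION: both endpoints of the window are treated as inclusive.
    """
    ref = month_index(ref_year, ref_month)
    return any(
        0 <= month_index(year, month) - ref <= window_months
        for (year, month) in birth_dates
    )


def past_pregnancy_panel(birth_dates, survey_year, survey_month, first_year=1997):
    """Pregnancy indicator for the survey month in each presurvey year."""
    return {
        year: was_pregnant(birth_dates, year, survey_month)
        for year in range(first_year, survey_year)
    }
```

A woman interviewed in June 2001 who reports a birth in December 2000 would, under this rule, be coded as pregnant in June 2000 and not pregnant in June of 1997 through 1999.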
For example, for women interviewed in June 2001, we can determine whether the woman was pregnant in June 2000 based on whether she reported giving birth any time between June 2000 and March 2001.

12. Our survey team returned to all sample villages three months after the 2003 survey in order to gather some additional data, which allowed us to identify four villages that had added cable after the survey. One question with these regressions is how to treat villages that do not have cable in 2003. Some of these villages might have gotten cable between our return survey and 2004, although most of them will not have. In Panel B of Table IV, we include these villages and assume they did not get cable, which preserves the sample sizes and comparability to Panel A. However, we have run these analyses excluding these villages and the results are very similar (available from the authors).

We can similarly construct measures of pregnancy going
back several years (we go back only until 1997).13 This measure will of course be imperfect; for example, women in the early stages of pregnancy may not know or may not report they are pregnant, leading to lower apparent pregnancy rates when measured by directly asking about current pregnancies. On the other hand, because many pregnancies result in spontaneous or induced abortions rather than births, estimating presurvey pregnancies by counting backward from presurvey births will lead to an undercount. Although these measurement difficulties make our estimated effects of cable on pregnancies less precise, they should not lead to bias because they should not be systematically correlated with whether the village has cable. However, for the sake of consistency, when we generate the presurvey pregnancy measures, we adjust the reported pregnancy levels upward.14 In column (5) of Table IV we use this longer, constructed panel of fertility to control for village-specific fertility trends. While this approach adds additional years of data, we have to exclude any villages that have cable in 2001, because we do not observe when they get cable. Even with the inclusion of the village-specific fertility trends, the coefficient on cable is negative and significant, indicating that the addition of cable produces a net-of-trend drop in pregnancy.15 In fact, the point estimate is increased by the inclusion of these trends, suggesting that relative to villages that did not add cable, those adding cable were if anything actually experiencing increasing trends in fertility prior to the introduction of cable (though we would not reject that the coefficients in the two regressions are equal). Finally, column (5) of Panel B again shows no effect of future cable.

13. Although it is possible to go back further, evidence from validation studies of fertility histories finds that there is underreporting of births, especially when the child later dies, for births more than five years in the past but little such underreporting of births in the previous five years (Bairagi et al. 1997).

14. We do this by comparing reported pregnancies in the 2001 survey to estimated pregnancies based on reported births in the 2002 birth history. The adjustment factor is 1.43, suggesting that women start reporting pregnancy when they are around three months pregnant, which seems realistic. It is worth noting that this adjustment addresses the possibility of both undercount and overcount by simply estimating the factor that makes the two measures comparable.

15. As noted in Section III, only 21 villages in the SARI data change their cable status during the sample. This could lead to the concern that the results presented here are driven by outliers; however, we have run all of the SARI regressions, leaving out each “changer” village in turn, with no noticeable effect on the results (results available from the authors).

Overall, the results in Table IV and Figures II–V suggest that the introduction of cable in these rural villages led to an improvement in the status of women, visible in changes in both
attitudes and behaviors. Although improving women’s status is an important goal in and of itself, several authors have argued that human capital investments in children are greater when women have higher status within the household (see, e.g., World Bank [2001, 2006]; Qian [2008]). We therefore turn now to looking at one specific outcome, education.16

V. RESULTS: CABLE AND EDUCATION

Our analysis of schooling makes use of two data sets: the SARI data and administrative data on school enrollment from Tamil Nadu.

V.A. SARI Data Analysis

The household roster file in the SARI survey records whether each child in the household is enrolled in school. We aggregate these data to the village level, and analyze the effect of cable access on school enrollment via three distinct groups. First, we follow enrollment among a fixed cohort of girls and boys ages 6–7 in the first survey year (2001) as they progress through school over the course of the panel. Second, we examine enrollment by sex for children ages 6–10 in the given survey year (i.e., this does not follow the same students over time; students age five in 2001 enter the group when they turn six in 2002, and students age ten in 2001 exit the group in 2002 when they turn eleven). Finally, we similarly examine enrollment by sex for children ages 11–14 in the given survey year. Summary statistics for these variables are in Panel C of Table II. Echoing the format of previous figures, Figures VIa and VIb illustrate the basic results using enrollment for children ages 6–14. Focusing first on girls, although the results are noisier than for women’s status, we do see evidence for an impact of cable on school enrollment. In the villages that do not change cable status, enrollment is either flat or slightly decreasing over the sample period. For those that get cable in 2002, enrollment increases between 2001 and 2002, and then further increases between 2002 and 2003, suggesting some increasing impact over time.
For those that get cable in 2003, enrollment decreases from 2001 to 2002 (consistent with the change in villages that always have had cable) but increases from 2002 to 2003.

16. It is worth noting, of course, that the effects of cable on education may be interesting in their own right, regardless of the link with women’s status.

Figure VIb, for
FIGURE VIa Cable Access and Education in SARI, Girls 6–14 This figure shows school enrollment for girls ages 6–14, broken down by villages that always have had cable, those that got cable for the first time in 2002, those that got it for the first time in 2003, and those that never have had cable.
FIGURE VIb Cable Access and Education in SARI, Boys 6–14 This figure shows school enrollment for boys ages 6–14, broken down by villages that always have had cable, those that got cable for the first time in 2002, those that got it for the first time in 2003, and those that never have had cable.
boys, shows less clear results and, in particular, does not appear to demonstrate any “on-introduction” impact of cable: moving from 2001 to 2002 in the 2002 adopters or 2002 to 2003 in the 2003 adopters does not appear to increase enrollment. However, there is a large increase between 2002 and 2003 for the 2002 adopters, again perhaps pointing to a delayed effect. Although the results for women’s status suggest immediate on-introduction effects, it is possible that any effects on education could take longer, for example, because plans for education must be made further in advance and money must be saved for fees and other costs. Table V turns to the regression results. The format of regressions is the same as in the analysis of women’s status, although the controls are village-level averages and village fixed effects replace individual fixed effects.17 Panel A presents the basic results. In columns (1) and (2), which focus on the fixed cohort of children ages 6–7 in 2001, the coefficients for both groups are positive, though neither is statistically significant. Columns (3) and (4) focus on children ages 6–10 and reveal a large (12 percentage points) and statistically significant increase in enrollment for girls but no effect for boys. Columns (5) and (6) look at older children, where there is no effect for either sex. Panel B explores whether there is evidence of a delayed effect of cable on school enrollment, as was suggested by Figures VIa and VIb. The regressions include an indicator for having cable for one year and having cable for two years.18 The sample excludes the 2001 data for villages that have cable in 2001 because we do not know how long they have had cable; by 2002 we know they must have had cable for at least two years, and so we include the data for 2002 and 2003. Panel B does show some evidence of a delayed effect: the coefficients on having cable for two years are typically larger than for having cable for one year.
Again, though, the results are only statistically significant for younger girls. The fact that we do not find effects for older children may reflect the higher costs of sending older children to school (both direct outlays and opportunity cost) or that many older children have already left school and it is difficult to return after having dropped out.

17. Analysis using individual-level enrollment data, reported in Online Appendix Table W.3, yields similar results.

18. Because the variable for having cable for two years can only be 1 for villages that add cable in 2002, this result could also reflect in part a larger impact for those villages relative to those that add cable in 2003, instead of or in addition to any increasing impact of cable.
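The years-of-access indicators used in Panel B of Table V could be built along the following lines. This is a hedged sketch, not the authors' coding: it assumes that "cable for one year" marks the first survey round in which a village is observed with cable, that "cable for two years" means two or more rounds with cable, and that villages already connected in 2001 are dropped for 2001 and treated as having two or more years of access afterwards. The function name and data layout are hypothetical.

```python
SURVEY_YEARS = (2001, 2002, 2003)


def access_indicators(has_cable):
    """Map each usable survey year to (cable_one_year, cable_two_years).

    has_cable: dict mapping each survey year to True/False for one village.
    Years that must be excluded from the sample are omitted from the result.
    """
    first_year = SURVEY_YEARS[0]
    rows = {}
    for year in SURVEY_YEARS:
        if has_cable[first_year]:
            # Adoption date unknown: drop 2001 and treat later years as 2+ years.
            if year == first_year:
                continue
            rows[year] = (0, 1)
            continue
        # ASSUMPTION: count survey rounds observed with cable up to this year.
        years_with_cable = sum(has_cable[y] for y in SURVEY_YEARS if y <= year)
        rows[year] = (int(years_with_cable == 1), int(years_with_cable >= 2))
    return rows
```

Under this coding only 2002 adopters can switch the two-year indicator on within the panel, which is the limitation raised in footnote 18.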
TABLE V
EFFECT OF CABLE ON EDUCATION, SARI DATA

Dependent variable: Enrollment rate among

                          Fixed cohort,
                          ages 6–7 in 2002      Ages 6–10             Ages 11–14
                          Girls      Boys       Girls      Boys       Girls      Boys
                          (1)        (2)        (3)        (4)        (5)        (6)

A. Effect of cable
Village has cable         .0777      .0332      .1177∗∗∗   −.0088     −.0378     −.0083
                          (.049)     (.084)     (.041)     (.08)      (.071)     (.086)
Number of observations    363        357        393        403        378        400
R2                        .05        .06        .10        .02        .06        .09

B. Effect of cable by years of access
Cable for one year        .1075∗     .0543      .1522∗∗∗   −.0778     −.0725     −.0135
                          (.061)     (.099)     (.052)     (.086)     (.082)     (.097)
Cable for two years       .1771∗∗    .1070      .1692∗∗∗   .1193      −.0517     −.0041
                          (.081)     (.099)     (.047)     (.091)     (.103)     (.09)
Number of observations    318        308        337        346        326        344
R2                        .04        .11        .12        .06        .08        .11

C. Effect of future cable
Village has cable         .0828∗     .0489      .1134∗∗∗   −.0169     −.024      −.0038
                          (.049)     (.084)     (.041)     (.08)      (.073)     (.087)
Cable next year           .0249      .0715      −.0186     −.041      .0633      .0291
                          (.052)     (.05)      (.046)     (.036)     (.053)     (.051)
Number of observations    363        357        393        403        378        400
R2                        .06        .08        .11        .04        .09        .12

Notes. This table shows the impact of cable access (whether the village has cable and years of cable access) on school enrollment in the SARI data. Columns (1) and (2) consider enrollment rate among a fixed cohort ages 6–7 in 2001, columns (3) and (4) limit to children ages 6–10, and columns (5) and (6) limit to children ages 11–14. Controls in all regressions include village fixed effects, year fixed effects, average child age-squared, average yearly income, and a linear control for year interacted with each of the following: age, age-squared, education, income this year, electricity, distance to nearest town, village population density, and state dummies. Standard errors are in parentheses, clustered by village. ∗ Significant at 10%. ∗∗ Significant at 5%. ∗∗∗ Significant at 1%.
In Panel C of Table V, we address the issue of possible pretrends as we did above, by including an indicator for getting cable in the next year. As above, the inclusion of this variable does not markedly change the effects of cable access and we see no evidence that changes in enrollment anticipate cable introduction.
V.B. DISE Data Analysis
We provide additional evidence on the impact of cable on education using a second data set of administrative data on schools
CABLE TELEVISION AND WOMEN’S STATUS IN INDIA
from Tamil Nadu compiled as part of the national District Information System for Education (DISE).19 The records contain data on enrollment and school characteristics for each school in Tamil Nadu yearly for the 2002–2003 through 2007–2008 school years. Using these data, we construct enrollment measures that parallel the SARI data. We first follow enrollment by sex for a fixed-age cohort who are 6–7 years old in the 2002–2003 school year.20 In addition, we separately analyze enrollment by sex of children ages 6–10 and 11–14 in each year. Summary statistics for these variables are in Panel C of Table II.
To match the DISE data, we gathered information on cable access for 1,061 villages in Tamil Nadu in March and April 2008. To select the villages, we identified five districts that had low cable penetration in 1998 (the latest year for which we have data from a separate survey, the 1998 National Family Health Survey, that allow us to identify district-level cable penetration). We chose low-access districts because our empirical analysis relies on comparing villages that added cable to those that did not, and districts with high cable penetration rates in 1998 would have had few villages adding cable after 2002 (the first year of the DISE data). Within these five districts (Salem, Tiruvannamalai, Pudukkottai, Sivaganga, and Ramanathapuram), we randomly selected a total of nineteen blocks (administrative units comprising groups of villages) and gathered information on all villages within those blocks. Surveyors were sent out to block headquarters to locate cable operators. For each village in the block, the surveyor asked the operators whether the village had cable, and in what year they got it. Approximately 63% of the villages in the sample had cable at the time of the survey, with starting dates ranging from 1989 through 2008. 
For the 2002 to 2007 period covered by the administrative education data, 394 villages already had cable in 2002, 277 added cable during the period, and 390 never got cable. Panel C of Table III shows the number of villages receiving cable for each year.21 These data show that the period covered by the DISE data represents a time of significant cable expansion in these blocks.22
The DISE data have several advantages relative to the SARI data, including providing objective, rather than self-reported, enrollment data, and a longer time series (valuable for checking for pretrends) with more variation in the timing of access. A significant limitation of these data, however, is that records provide only raw enrollment numbers, and for most villages we have child population data for only one year of the survey (2005). As a result, we can analyze only total enrollment, not enrollment rates. Finding an effect of cable access on enrollment in these data is therefore consistent with either an increase in enrollment rates or an increase in total school-age population with no changes in the likelihood of enrollment. This is not an issue in the SARI data where we have a fixed population over time. Although with these data we are unable to fully address this issue, below we provide two pieces of evidence that, although only suggestive, do point to an enrollment effect rather than population growth.
One other challenge is that our village survey only reported the year the village got cable, not the month. This makes it difficult to match with precision the timing of cable introduction and school enrollment. For example, villages that received cable in 2002 may have added it in August, when it was already too late to influence the decision on whether to enroll a child in the 2002–2003 academic year, because the school year begins in early June and late enrollment is generally not permitted. This suggests that the effects of cable on enrollment in many cases may not show up until, at the earliest, the academic year starting in the calendar year after cable is introduced. To capture this and any other potential delayed (or increasing) effects of cable, our empirical analysis will analyze the effect of both having cable and years of cable access.
19. These data were obtained on CD from the Tamil Nadu district education office in Chennai.
20. Note that this is close to what we do in SARI, but not identical; in the SARI data we can actually follow the same children over time. In the DISE data, we cannot be sure that the seven- to eight-year-old children in 2003 are the same as the six- to seven-year-old children in 2002. In this sense we are following a cohort of fixed age, but not fixed individuals.
21. A number of villages are recorded as having had cable at some point but then losing it. Most such cases involved switching from traditional cable access (an entrepreneur with a satellite dish sells cable connections) to direct-to-home (DTH) satellite service. We code these villages as having cable, because DTH provides access to the same programming available through cable. However, 28 villages are recorded as having lost cable for some other reason (such as failure to pay or problems with local partners). In these cases we know the beginning and ending dates of cable access and code the access accordingly. Dropping village-year observations after service is dropped to allow for the possibility of persistent or delayed effects of cable does not change the results appreciably.
22. Table III shows some evidence of “bunching” of cable access at 2000 and 2005. This may be due to misreporting or rounding in the reports by the cable operators. We have run the analysis excluding villages getting cable in 2000 or 2005 and find no significant differences (results available from the authors).
Using these data, we follow a specification similar to that above, regressing log enrollment on whether the village has cable and years of cable access. To account for possible serial correlation, we use the Prais-Winsten estimator. The controls include electrification (inferred from whether the school has electricity), population,23 and distance to the nearest town.24
The primary results are in Panel A of Table VI. Although there is no on-introduction effect of cable (the coefficient on having cable is small and insignificant) for the fixed cohort and for children under ten, there is strong evidence that continued exposure to cable (years of access) increases enrollment. For the fixed cohort ages 6–7, the effect is large, with each additional year of access increasing enrollment by about 5%. For children ages 6–10, the effect is smaller, between 1% and 2%, but still statistically significant. Finally, consistent with the SARI data, we see no effect for the older group.
We can see some visual evidence of the results in Table VI for the fixed-age cohort (including both sexes) in Figure VII. These graphs show the changes in enrollment over time, by date of cable adoption, relative to areas that never get cable. Although the figures lack the sharpness of the SARI figures, we do see evidence of increases in enrollment after cable is introduced.25
Because there is some suggestive evidence that the effect of cable on enrollment increases over time, in Panel B of Table VI we estimate the regressions excluding villages that already have cable in 2002. Doing so eliminates the increased enrollment for this portion of the previous “control group” that was still experiencing gains from having recently added cable. The coefficients are of similar magnitude, though there is some loss of precision due to the smaller sample size.26 In Panel C of Table VI, we use all villages but include a linear control for year interacted with block dummies to allow for differential trends by block. 
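To make the serial-correlation adjustment mentioned above concrete, the following is a minimal sketch of the Prais-Winsten transformation for a single time series with AR(1) errors. It is an illustration only: the function names, the single-regressor setup, and the treatment of the autocorrelation parameter rho as given are assumptions, and the sketch omits the village and year fixed effects used in the regressions reported here.

```python
import math


def prais_winsten_transform(values, rho):
    """Quasi-difference a series; the first observation is rescaled rather than dropped."""
    out = [math.sqrt(1.0 - rho ** 2) * values[0]]
    out += [values[t] - rho * values[t - 1] for t in range(1, len(values))]
    return out


def estimate_rho(residuals):
    """AR(1) coefficient from regressing e_t on e_{t-1} without a constant."""
    num = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals[:-1])
    return num / den if den else 0.0


def prais_winsten_fit(x, y, rho):
    """OLS of transformed y on the transformed constant and x; returns (intercept, slope)."""
    c = prais_winsten_transform([1.0] * len(x), rho)
    xs = prais_winsten_transform(x, rho)
    ys = prais_winsten_transform(y, rho)
    # Normal equations for the two regressors (c, xs), solved by Cramer's rule.
    scc = sum(a * a for a in c)
    scx = sum(a * b for a, b in zip(c, xs))
    sxx = sum(b * b for b in xs)
    scy = sum(a * d for a, d in zip(c, ys))
    sxy = sum(b * d for b, d in zip(xs, ys))
    det = scc * sxx - scx * scx
    intercept = (scy * sxx - scx * sxy) / det
    slope = (scc * sxy - scx * scy) / det
    return intercept, slope
```

In practice rho is not known; one would typically estimate it from the residuals of an initial OLS fit (as in `estimate_rho`) and, if desired, iterate between the two steps until the estimate settles.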
23. The population data available in the DISE data are for children ages 6–14, which is a reliable proxy for overall population only under the assumption that dependency ratios are not systematically different across village types.
24. We attempted to include additional controls by merging our data with data from the Census, but differences in village names meant we would be able to match only half of the DISE villages.
25. Online Appendix Figures W.1 and W.2 show these graphs separated by sex.
26. Note that in contrast to the SARI data, the coefficient on “years of cable” in this case is identified partially from villages that already have cable in 2002, because they change their years of cable over the sample period, even if they do not change whether they have cable at all.

TABLE VI
EFFECT OF CABLE ON EDUCATION IN DISE DATA

                                Fixed cohort,
Sample:                         ages 6–7 in 2002      Ages 6–10             Ages 11–14
                                Girls      Boys       Girls      Boys       Girls      Boys
                                (1)        (2)        (3)        (4)        (5)        (6)

A. All villages, no block trends
Explanatory variables
Village has cable               .0418      .0233      −.0142     .0055      −.0766     −.1185∗
                                (.041)     (.042)     (.015)     (.016)     (.066)     (.067)
No. of years access             .0529∗∗∗   .054∗∗∗    .0095∗∗    .0195∗∗∗   .0077      .0047
                                (.014)     (.014)     (.005)     (.005)     (.02)      (.021)
Demographic controls            YES        YES        YES        YES        YES        YES
Block-specific trends           NO         NO         NO         NO         NO         NO
Number of observations          4,289      4,308      5,165      5,164      3,578      3,563

B. Villages with cable after 2002, no block trends
Explanatory variables
Village has cable               .0409      .0205      −.0128     .0032      −.0497     −.1004
                                (.042)     (.041)     (.016)     (.017)     (.066)     (.068)
No. of years access             .0518∗∗    .0735∗∗∗   .0004      .0184∗∗    −.045      −.0199
                                (.022)     (.022)     (.008)     (.008)     (.031)     (.033)
Demographic controls            YES        YES        YES        YES        YES        YES
Block-specific trends           NO         NO         NO         NO         NO         NO
Number of observations          2,428      2,439      3,025      3,025      2,009      1,988

C. All villages, block trends
Explanatory variables
Village has cable               .0578      .0382      −.0075     .0144      −.0147     −.0776
                                (.042)     (.043)     (.016)     (.016)     (.067)     (.068)
No. of years access             .0652∗∗∗   .0693∗∗∗   .0089∗     .026∗∗∗    .0242      .0118
                                (.016)     (.016)     (.006)     (.006)     (.023)     (.024)
Demographic controls            YES        YES        YES        YES        YES        YES
Block-specific trends           YES        YES        YES        YES        YES        YES
Number of observations          4,289      4,308      5,165      5,164      3,578      3,563

D. All villages, control for pretrends
Explanatory variables
No. of years access             .0516∗∗∗   .0523∗∗∗   .0099∗∗    .0209∗∗∗   .0029      −.0037
                                (.014)     (.014)     (.005)     (.005)     (.021)     (.021)
No. of years until access ×−1   .0066      −.0007     −.0016     .0094      −.0423     −.0707∗∗
                                (.022)     (.022)     (.008)     (.008)     (.033)     (.034)
Demographic controls            YES        YES        YES        YES        YES        YES
Block-specific trends           NO         NO         NO         NO         YES        YES
Number of observations          4,289      4,308      5,165      5,164      3,578      3,563

Notes. This table reports the effect of cable access on school enrollment in the DISE data from Tamil Nadu. Columns (1) and (2) use a fixed-age cohort, 6–7 years, in 2002; columns (3) and (4) use children under 10 in each year, and columns (5) and (6) use children ages 11–14 in each year. Panel B excludes villages that got cable before 2002. Panel C includes block-specific trends (a block is an administrative unit larger than a village but smaller than a district). Panel D includes a control for years until cable access; this is multiplied by −1, which means that if enrollment is increasing in anticipation of getting cable, the coefficient on this variable will be positive. Controls in all regressions include village fixed effects, year fixed effects, and a linear control for year interacted with population of children ages 6–14, distance to block headquarters, and electricity in school. Standard errors in parentheses are adjusted for serial correlation. ∗ Significant at 10%. ∗∗ Significant at 5%. ∗∗∗ Significant at 1%.

FIGURE VII
Trends in Enrollment, Administrative Data from Tamil Nadu
This figure shows total school enrollment of a fixed cohort of students who are ages 6–7 in 2002, across villages that adopt cable in different years. The data are drawn from administrative data in Tamil Nadu. Dotted lines indicate the first year of possible cable access; because school enrollment decisions are likely made around the middle of the year, only some villages that, for example, get cable in 2005 will have cable during the 2005–2006 school year.

The results are little changed in terms of magnitude, though in some cases they
are less precisely estimated. Finally, in Online Appendix Tables W.4 and W.5, we show that the results are broadly robust to several alternative specifications, including analyzing enrollment by grade rather than age, focusing only on access rather than years of access, and specifying years of access as a series of dummies rather than a linear variable.
As with the results above, there is concern over possible pretrends, with other variables driving both increased enrollment and adoption of cable. In fact, this is a greater concern here than above because we measure only total enrollment, not enrollment rates, and population could be increasing more rapidly (and along with it, total school enrollment) in areas adding cable. In Panel D of Table VI we attempt to address this concern by including a control for years of cable and a control for years until cable access, with the latter multiplied by −1. If there is an upward trend in enrollment before cable access, we would expect to see a positive effect on the measure of years until cable access (we multiply this variable by −1 for ease of interpretation, so that both a posttrend and a pretrend are indicated by positive values). The results do not suggest pretrends. Although the effect of years of cable remains positive and statistically significant, the effect of years until cable is positive in only two of the cases, and in both, the coefficients are small and not statistically significant. And for both boys and girls ages 11–14, the evidence indicates that in relative terms, enrollment was declining in villages that later added cable, the opposite of what we would be concerned with.
The evidence in Panel D does not, however, rule out the possibility that population increases after cable is introduced, such as through higher migration to a village with cable. 
To test for possible population growth following the addition of cable, we use a subsample of 416 DISE villages for which we have data on school-age population in both 2005 and 2007. For this set of villages, we regressed the change in log population between 2005 and 2007 on an indicator for having cable in 2005. The coefficient and standard error are −0.032 (0.063). Thus, the point estimate is in fact negative, that is, lower (child-age) population growth in villages that added cable, though we cannot reject the hypothesis of no differential change between the two groups. However, we also note that the coefficient is not very precisely estimated, and we therefore cannot rule out large, positive effects of as much as 5% to 10%. So, while there is no evidence here that population trends are driving the results, we cannot fully rule out this possibility.
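The range of effects that the population regression cannot rule out follows directly from the reported coefficient and standard error. The short calculation below is a sketch that assumes a normal approximation and a 95% confidence level (neither is stated in the text); the function name is hypothetical.

```python
def confidence_interval(coef, se, z=1.96):
    """Two-sided interval coef +/- z * se; z = 1.96 assumes a 95% normal approximation."""
    return coef - z * se, coef + z * se


# Reported estimate: change in log child population regressed on having cable in 2005.
lower, upper = confidence_interval(-0.032, 0.063)
print(f"95% interval: [{lower:.3f}, {upper:.3f}]")
```

Under these assumptions the upper bound is about 0.09 log points, which is consistent with the statement that positive population effects of 5% to 10% cannot be rejected.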
Taken together, the DISE and SARI data suggest that cable leads to increased school enrollment for younger children. Given the large literature showing that increases in women’s status and decision-making authority are associated with gains in children’s outcomes, it is certainly plausible that the schooling results are related to the improvements in women’s status, such as participation in household decision making, documented earlier. Of course, we cannot rule out that cable influences enrollment through other channels, such as by providing information about the returns to schooling or about government programs promoting education.
VI. DISCUSSION AND CONCLUSION
In this paper, we find that the introduction of cable television improves the status of women: women report lower acceptability of spousal abuse, lower son preference, more autonomy, and lower fertility. In addition, cable is associated with increases in school enrollment, perhaps itself an indicator of similar increased status and decision-making authority within the household. Thus, programs to provide televisions, such as the large program currently under way in Tamil Nadu, may in fact have significant implications for important development priorities.
There are several mechanisms through which cable television may affect women’s status. For example, television may affect fertility by providing information on family planning services or changing the value of women’s time. Or women may be given more freedom to do things outside the home such as going to the market, because the value of men’s leisure is increased by television. However, one plausible mechanism is that television exposes rural households to urban lifestyles, values, and behaviors that are radically different from their own and that households begin to adopt or emulate some of these, as suggested by many anthropological and ethnographic studies of television in India (Mankekar 1993, 1998; Fernandes 2000; Johnson 2001; Scrase 2002). 
Certainly, the differences between rural and urban settings are marked. For example, in our SARI data, the number of situations in which women report that it is acceptable for a husband to beat his wife is 1.4 in rural areas without cable access, compared to 1.0 in urban areas; the mean autonomy measure in urban areas is 0.67, compared to 0.60 for noncable rural areas; and about half of women in urban areas want their next child to
be a son, compared to 67% in noncable rural areas. The addition of cable goes a long way toward closing these gaps, decreasing son preference by 12 percentage points (70% of the gap), increasing the autonomy index by 0.025 (41%), and decreasing the number of acceptable beating situations by 0.16 (46%). Of course, we cannot causally attribute the changes observed to rural households emulating urban household values and behavior. And certainly, some of the other mechanisms mentioned earlier are likely to play some role. For outcomes such as changes in son preference and attitudes toward beating, however, it is less clear what mechanisms other than changing values and attitudes could be at play.
The possibility that changes in norms, values, or attitudes lie behind these results is particularly intriguing as a contrast to typically proposed approaches to improving education and women’s status or reducing fertility. For example, for education, the emphasis is often on reducing poverty, cutting school fees, building schools, and improving school and teacher quality. For fertility, the emphasis is often on factors such as expanding access to family planning goods and services. And efforts to promote women’s status are often vague, such as calls to “empower women.” In many of these cases, the solutions (such as reducing poverty) are as difficult to accomplish as the problems they are attempting to solve, and potentially can only be achieved over a long time period and with significant resources. Because adding cable television brought about none of these intermediate steps, such as reducing poverty or cutting school fees, it is arguably the case that some component of these problems is the result of norms and attitudes. 
Although these other strategies are worthwhile, both in themselves and as solutions to the problems of education, fertility, and women’s status, and although cable clearly cannot solve any part of these problems that is in fact related to underlying structural problems such as poverty, the possibility that some of these behaviors may be changed largely because of changes in attitudes, cheaply and quickly supplied by TV, offers significant promise. As we think about policy, however, it is worth noting that the effects estimated in this paper may be larger than what would be expected if cable were introduced more widely. Although we have argued that preexisting trends in attitudes do not drive the results, we cannot rule out the possibility that television is introduced first into areas that have the biggest potential for change; those that are receptive to television may also be receptive to changing their gender attitudes. Thus, while the effect of cable is
correctly estimated within sample, the effect of further introduction may be smaller, or slower. Nevertheless, given the magnitude of the effects estimated here, even much smaller effects could have significant impacts in India.
SCHOOL OF PUBLIC AFFAIRS, UCLA, WATSON INSTITUTE FOR INTERNATIONAL STUDIES, BROWN UNIVERSITY, AND NBER
UNIVERSITY OF CHICAGO AND NBER
REFERENCES
Agnihotri, Satish, Sex Ratio Patterns in the Indian Population: A Fresh Exploration (New Delhi, India: Sage, 2000).
Agnihotri, Satish, Richard Palmer-Jones, and Ashok Parikh, “Missing Women in Indian Districts: A Quantitative Analysis,” Structural Change and Economic Dynamics, 13 (2002), 285–314.
Bairagi, Radheshyam, Stan Becker, Andrew Kantner, Karen Allen, Ashish Datta, and Keith Purvis, “An Evaluation of the 1993–94 Bangladesh Demographic and Health Survey within the Matlab Area,” Asia-Pacific Population Research Report No. 11, 1997.
Basu, Alaka, “Is Discrimination in Food Really Necessary for Explaining Sex Differentials in Childhood Mortality?” Population Studies, 43 (1989), 193–210.
Becker, Anne, “Television, Disordered Eating, and Young Women in Fiji: Negotiating Body Image and Identity during Rapid Social Change,” Culture, Medicine, and Psychiatry, 28 (2004), 533–559.
Borooah, Vani, “Gender Bias among Children in India in Their Diet and Immunisation against Disease,” Social Science and Medicine, 58 (2004), 1719–1731.
Chong, Alberto, Suzanne Duryea, and Eliana La Ferrara, “Soap Operas and Fertility: Evidence from Brazil,” Mimeo, Bocconi University, 2007.
DellaVigna, Stefano, and Ethan Kaplan, “The Fox News Effect: Media Bias and Voting,” Quarterly Journal of Economics, 122 (2007), 1187–1234.
Fernandes, Leela, “Nationalizing ‘The Global’: Media Images, Cultural Politics and the Middle Class in India,” Media, Culture & Society, 22 (2000), 611–628.
Gentzkow, Matthew, and Jesse Shapiro, “Media, Education and Anti-Americanism in the Muslim World,” Journal of Economic Perspectives, 18 (2004), 117–133.
Griffiths, Paula, Zoe Matthews, and Andrew Hinde, “Gender, Family, and the Nutritional Status of Children in Three Culturally Contrasting States of India,” Social Science and Medicine, 55 (2002), 775–790.
Holbert, R. Lance, Dhavan Shah, and Nojin Kwak, “Political Implications of Prime-Time Drama and Sitcom Use: Genres of Representation and Opinions Concerning Women’s Rights,” Journal of Communication, 53 (2003), 45–60.
International Institute for Population Sciences and ORC Macro, “National Family Health Survey (NFHS-2),” Manuscript, International Institute for Population Sciences and ORC Macro, Mumbai, India, 2000.
Jha, Prabhat, Rajesh Kumar, Priya Vasa, Neeraj Dhingra, Deva Thiruchelvam, and Rahim Moineddin, “Low Male-to-Female Sex Ratio of Children Born in India: National Survey of 1.1 Million Households,” Lancet, 367 (2006), 211–218.
Johnson, Kirk, “Media and Social Change: The Modernizing Influences of Television in Rural India,” Media, Culture & Society, 23 (2001), 147–169.
Kane, Thomas, Mohamadou Gueye, Ilene Speizer, Sara Pacque-Margolis, and Danielle Baron, “The Impact of a Family Planning Multimedia Campaign in Bamako, Mali,” Studies in Family Planning, 29 (1998), 309–323.
Kottak, Conrad, Prime-Time Society: An Anthropological Analysis of Television and Culture (Belmont, CA: Wadsworth Modern Anthropology Library, 1990).
La Pastina, Antonio, “Telenovela Reception in Rural Brazil: Gendered Readings and Sexual Mores,” Critical Studies in Media Communication, 21 (2004), 162–181.
Mankekar, Purnima, “National Texts and Gendered Lives: An Ethnography of Television Viewers in a North Indian City,” American Ethnologist, 20 (1993), 543–563.
——, “Entangled Spaces of Modernity: The Viewing Family, the Consuming Nation and Television in India,” Visual Anthropology Review, 14 (1998), 32–45.
——, Screening Culture, Viewing Politics: An Ethnography of Television, Womanhood and Nation in Postcolonial India (Durham, NC: Duke University Press, 1999).
Mishra, Vinod, T. K. Roy, and Robert Retherford, “Sex Differentials in Childhood Feeding, Health Care, and Nutritional Status in India,” Population and Development Review, 30 (2004), 269–295.
Morgan, Michael, and Nancy Rothschild, “Impact of the New Television Technology: Cable TV, Peers, and Sex-Role Cultivation in the Electronic Environment,” Youth and Society, 15 (1983), 33–50.
Murthi, Mamta, Anne-Catherine Guio, and Jean Dreze, “Mortality, Fertility and Gender Bias in India: A District Level Analysis,” Population and Development Review, 21 (1995), 745–782.
National Readership Studies Council, “The National Readership Study,” Manuscript, National Readership Studies Council, New Delhi, India, 2006.
NSSO (National Sample Survey Organization), “Reports on Village Facilities, NSS 58th Round,” Manuscript, NSS, New Delhi, India, 2003.
Olken, Ben, “Do Television and Radio Destroy Social Capital? Evidence from Indonesian Villages,” NBER Working Paper No. 12561, 2006.
Oster, Emily, “Proximate Causes of Population Sex Imbalance in India,” Demography, 46 (2009), 325–339.
Pace, Richard, “First-Time Televiewing in Amazonia: Television Acculturation in Gurupa, Brazil,” Ethnology, 32 (1993), 187–205.
Pande, Rohini, “Selective Gender Differences in Childhood Nutrition and Immunization in Rural India: The Role of Siblings,” Demography, 40 (2003), 395–418.
Qian, Nancy, “Missing Women and the Price of Tea in China: The Effect of Relative Female Income on Sex Imbalance,” Quarterly Journal of Economics, 123 (2008), 1251–1286.
Rahman, Lupin, and Vijayendra Rao, “The Determinants of Gender Equity in India: Examining Dyson and Moore’s Thesis with New Data,” Population and Development Review, 30 (2004), 239–268.
Rogers, Everett, Peter Vaughan, Ramadhan Swalehe, Nagesh Rao, Peer Svenkerud, and Suruchi Sood, “Effect of an Entertainment-Education Radio Soap Opera on Family Planning Behavior in Tanzania,” Studies in Family Planning, 30 (1999), 193–211.
Rosenzweig, Mark, and T. Paul Schultz, “Market Opportunities, Genetic Endowments and Intrafamily Resource Distribution: Child Survival in India,” American Economic Review, 72 (1982), 803–815.
Scrase, Timothy, “Television, the Middle Classes and the Transformation of Cultural Identities in West Bengal, India,” Gazette: The International Journal for Communication Studies, 64 (2002), 323–342.
Sen, Amartya, “Missing Women,” British Medical Journal, 304 (1992), 587–588.
Thomas, Bella, “What the World’s Poor Watch on TV,” World Press Review, 50 (2003).
Valente, Thomas, Young Mi Kim, Cheryl Lettenmaier, William Glass, and Yankuba Dibba, “Radio Promotion of Family Planning in The Gambia,” International Family Planning Perspectives, 20 (1994), 96–100.
World Bank, “Engendering Development,” Manuscript, World Bank, 2001.
——, World Development Report: Equity and Development (Washington, DC: World Bank and Oxford University Press, 2006).
CULTURAL BIASES IN ECONOMIC EXCHANGE?∗
LUIGI GUISO
PAOLA SAPIENZA
LUIGI ZINGALES
How much do cultural biases affect economic exchange? We answer this question by using data on bilateral trust between European countries. We document that this trust is affected not only by the characteristics of the country being trusted, but also by cultural aspects of the match between trusting country and trusted country, such as their history of conflicts and their religious, genetic, and somatic similarities. We then find that lower bilateral trust leads to less trade between two countries, less portfolio investment, and less direct investment, even after controlling for the characteristics of the two countries. This effect is stronger for goods that are more trust intensive. Our results suggest that perceptions rooted in culture are important (and generally omitted) determinants of economic exchange.
We always have been, we are, and I hope that we always shall be detested in France.
Duke of Wellington
I. INTRODUCTION
There are remarkable differences in the level of trust among European managers. When asked to score fellow managers of different countries on the basis of their trustworthiness, their responses implied the following ranking (where 1 is the best and 5 the worst):1

View        Great Britain   France   Germany   Italy   Spain
British           1            4        2        5       3
French            4            2        1        5       3
German            2            3        1        5       4
Italian           3            2        1        4       5
Spanish           2            4        1        5       3

∗ We thank Giuseppe Nicoletti for providing the Organisation for Economic Co-operation and Development data set, Michele Gambera for providing the Morningstar portfolio data, and Roc Armenter for his excellent job as a research assistant. We also thank Franklin Allen, Marianne Baxter, Patricia Ledesma, Mitchell Petersen, Andrei Shleifer, René Stulz, and Samuel Thompson for their very helpful comments. We thank Jim Fearon and Romain Wacziarg for their help with the measure of linguistic common roots. We benefited from the comments of participants in seminars at the European University Institute, Wharton, Northwestern University, University of Chicago, University of Wisconsin, NBER Corporate Finance, International Trade, and Behavioral Meetings, and we thank Peggy Eppink for editorial help. Luigi Guiso acknowledges financial support from MURST and the EEC. Paola Sapienza acknowledges financial support from the Center for International Economics and Development at Northwestern University. Luigi Zingales acknowledges financial support from the Center for Research on Security Prices and the Stigler Center at the University of Chicago.
1. The survey was carried out by the 3i/Cranfield European Enterprise Center on a total of 1,016 managers (managing companies under 500 employees) from five major European Community countries: Great Britain (433 responses), France (127), Germany (135), Italy (185), and Spain (136). See Burns, Myers, and Bailey (1993).
© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009
Among these managers there seem to be some common views: everyone ranks German managers relatively high and Italian ones relatively low. There is also a “home-country bias”: managers rank their fellow countrymen higher than managers from other countries rank them. For instance, Italian managers rank themselves fourth in trustworthiness, while they are ranked fifth (last) by every other group. More surprising, there are some match-specific attitudes. French managers rate British managers lower than any others except the Italians, which seems at odds with the ranking chosen by every other group. However, the British managers reciprocate this attitude (as the Duke of Wellington’s opening quote seems to suggest).
These facts are not peculiar to this data set. As we show, they are exactly replicated in an independent and broader survey (Eurobarometer). In this paper, we use this larger data set to explain why the perception of trustworthiness differs so greatly across Europe. We also use it to explore the economic consequences of these different perceptions.
To disentangle the country-specific components of trust from the match-specific ones we regress bilateral trust on fixed effects for the country receiving trust (country-of-destination fixed effects) and fixed effects for the country trusting (country-of-origin fixed effects). The country-of-destination fixed effects capture the common view about the trustworthiness of a country, which derives from the quality of the law and its enforcement. The country-of-origin fixed effects capture possible systematic differences in the way different populations answer the survey. We then try to explain bilateral trust, after controlling for the above fixed effects, with differences in information and culture. We find that geographical distance between two countries, their proximity, and the commonality between the two languages have a significant effect on bilateral trust.
By contrast, bilateral trust is negatively correlated with a country’s exposure in the domestic newspapers of another country. Sharing the same legal origin (a variable that could proxy for both information and culture) has
a positive and significant effect on the level of trust, as long as we do not control for the common linguistic root. Once we control for linguistic root, the commonality-of-law effect halves and becomes insignificant, suggesting that most of the effect comes from cultural commonalities.
As a first pure measure of a country’s cultural tradition, we use commonality of religion. Religion had (and still has) a great impact on what is taught in school and how it is taught. Hence, we expect that two countries with the same religion tend to have similar cultures and therefore will trust each other more. Indeed, we find this to be the case. A pair of countries where 90% of citizens share the same religion (e.g., Italy and Spain) has a level of bilateral trust one-quarter of a standard deviation higher.
To further measure cultural similarity between two populations, we introduce two new variables. First is the genetic distance between two populations, which, as Ammerman and Cavalli-Sforza (1985) claim, reflects the history of invasions during the Neolithic Age and thus their common linguistic and cultural roots. As DeBruine (2002) has shown in an experiment, people trust people who look like them more than those who do not. We find this to be true also in our sample. A one-standard-deviation increase in genetic distance reduces the level of bilateral trust by 1.8 standard deviations. Second, we derive from Biasutti (1954) an indicator of somatic distance, based on the average frequency of specific traits (hair color, height, etc.) present in the indigenous population. People trust other people who look like them more. A one-standard-deviation increase in somatic distance decreases trust by one-quarter of a standard deviation. When we use both the aforementioned variables, only the latter remains significant.
Finally, to capture the effect of more recent aspects of the cultural tradition, we use a country’s history of wars.
People’s priors can be affected by their education and in particular by the history they study in school. For instance, Italian education emphasizes the struggles that led to the reunification of the country in the nineteenth century. Because the major battles during this period were fought against Austria, Italian students may develop, as our data show, a negative image of Austrians. We find that countries with a long history of wars tend to trust each other less. France and England, which have a record 198 years of war (more than ten times the average of nineteen) should exhibit a bilateral trust that is 0.7 of a standard deviation lower than average, which fully
accounts for the lower bilateral trust we observe between the two countries. Once we establish the cultural roots of trust, we move to study the effect of trust on international trade and investments. Unlike Anderson and Marcouiller (2002), De Groot et al. (2004), Berkowitz, Moenius, and Pistor (2006), and Nunn (2007), who look at the effect of country-level institutional variables (for either the importing or the exporting country) on trade, we look at the effect of a match-specific variable (bilateral trust) on trade and investments. We find that a higher level of bilateral trust can explain cross-country trade beyond what extended gravity models can account for, even after controlling for the better estimates of transportation costs suggested by Giuliano, Spilimbergo, and Tonon (2006). At sample means, a one-standard-deviation increase in the importer’s trust toward the exporter raises exports by 10%. Consistent with a trust-based explanation, we find that trust matters more for trade in goods that Rauch (1999) classifies as differentiated goods, which can vary greatly in quality. We then instrument trust with its long-term cultural components (the commonality in religion and in ethnic origin) and obtain much larger coefficients. Despite the fact that we pass the test of overidentifying restrictions, this difference suggests that culture is likely to affect trade through other channels besides trust. We find similar results when we analyze the pattern of foreign direct investments (FDI) and portfolio investments. A country is more willing to invest in another (either directly or via the equity market) when it trusts the other country’s citizens more. Not only do the latter results confirm our trade ones, but they also suggest that cultural effects are not limited to unsophisticated consumers, but are also present among sophisticated professionals such as mutual fund managers. 
Our combined results suggest that cultural relationships affect trust and are an important omitted factor in international trade and investments. In this respect, our paper is part of a new strand of literature that looks at the effect of culture on economic and political outcomes (Barro and McCleary 2003; Guiso, Sapienza, and Zingales 2003, 2004b, 2006, 2008a, 2008b; Fernández and Fogli 2007; Giuliano 2007; and Tabellini 2007, 2008). Because genetically similar countries trust each other more and thus can transfer technology faster and more effectively, our
results explain the correlation between level of development and genetic distance found by Spolaore and Wacziarg (2009). Finally, our results are validated in a micro setting by Bottazzi, Da Rin, and Hellmann (2007), who find that venture capitalists are more likely to invest in start-ups of countries they trust more. In our attempt to explain several international exchange puzzles, our paper is similar to that of Portes and Rey (2005). However, they do not consider trust as a key determinant, but instead focus on differences in information, measured as telephone traffic between two countries and the number of local foreign bank branches.2
2. Our paper is also related to those of Vlachos (2004), Morse and Shive (2006), and Cohen (2009). Morse and Shive (2006) relate portfolio choices to the degree of patriotism of a country. Cohen (2009) shows that employees’ bias toward investing in their own company is not due to information, but to some form of loyalty toward their company, which can easily be interpreted as trust. Both of these papers thus illustrate one specific dimension in which cultural biases can affect economic choices. Our paper can be seen as a generalization of Rauch and Trindade (2002). They find that the percentage of ethnic Chinese in a country helps predict the level of trade beyond the standard specification. We show that this result is not specific to ethnic networks. Any cultural barrier (or lack thereof) significantly impacts trade and investments.
II. BILATERAL TRUST
We obtain our measures of trust from a set of surveys conducted by Eurobarometer and sponsored by the European Commission. The surveys were designed to measure public awareness of, and attitudes toward, the Common Market and other European Community institutions (see the Online Data Appendix for details). They were conducted on a representative sample of the total population of age sixteen (or fifteen depending on the wave) and older: about 1,000 individuals per country. The set of countries sampled varies over time with the enlargement of the European Union: there were five in 1970 (France, Belgium, The Netherlands, Germany, and Italy), when the first survey was conducted, and it had grown to seventeen in 1995, the last survey to which we have access (besides the five countries above, Luxembourg, Denmark, Ireland, Great Britain, Northern Ireland, Greece, Spain, Portugal, Norway, Sweden, Finland, and Austria are also included). One distinct feature of these surveys is that respondents were asked to report how much they trust their fellow citizens and how much they trust the citizens of each of the countries in the European Union. More specifically, they were asked the following:
“I would like to ask you a question about how much trust you have in people from various countries. For each, please tell me whether you have a lot of trust, some trust, not very much trust, or no trust at all.” In some of the surveys, this same question was also asked with reference to citizens of a number of non–European Union countries, including the United States, Russia, Switzerland, China, Japan, Turkey, and some Eastern and Central European countries (Bulgaria, Slovakia, Romania, Hungary, Poland, Slovenia, and the Czech Republic). To ensure a relative degree of homogeneity in trading rules and living standards, we restrict our analysis to the countries belonging to the European Economic Area (EEA): European Union members plus Norway. These are also the countries for which we have both the trust from and to, thereby making the trust matrix square.3 As in every survey, there may be some doubts about the way people interpret the trust question. First, there is some ambiguity about how to interpret it. In a trust game, the level of trust maps into the amount of money you are willing to risk. Here, this mapping is missing. Second, we are concerned whether a high level of trust reflects a high trust in a generic citizen of a different country or a better ability to identify the trustworthy people in a different country, which translates into a higher willingness to trust them. To address these doubts, in a separate survey we asked a sample of 1,990 individuals both the question above and the two following ones: (i) “Suppose that a random person you do not know personally receives by mistake a sum of 1,000 euros that belong to you. He or she is aware that the money belongs to you and knows your name and address. He or she can keep the money without incurring in any punishment.
According to you what is the probability (a number between zero and 100) that he or she returns the money?” and (ii) “How good are you (very good, good, not very good, not good at all) in detecting people who are trustworthy?” (Guiso, Sapienza, and Zingales, 2008c). We find that the first question is highly statistically correlated with the measure of trust used in this paper, but the second one is not (the sign is actually negative, albeit not statistically significant). Hence, these data provide evidence that the reported level of trust reflects the subjective probability that a random person is trustworthy.
3. In the NBER working paper version we also considered the full rectangular matrix of trust.
There can also be doubts about the external validity of this question. Glaeser et al. (2000), for instance, raise doubts about the validity of the World Values Survey (WVS) trust question (which is similar to the one we use), by showing that it is not correlated with the sender behavior in the standard trust game (Berg, Dickhaut, and McCabe 1995). However, Sapienza, Toldra, and Zingales (2007) argue that the sender’s behavior in the trust game is not a good measure of trust, because it is affected by other-regarding preferences. From the trust game we can derive a better indicator of trust: the sender’s expectation about the receiver’s behavior. Sapienza, Toldra, and Zingales (2007) show that the WVS trust question as well as other similar trust questions are strongly correlated with these expectations. Furthermore, in a sample of Dutch households, Guiso, Sapienza, and Zingales (2008c) find a correlation between the answer to the WVS question on trust and the decision to invest in equity. Thus, this survey-based measure does have some external validity.
This WVS-type of question measures generalized trust, the trust people have toward a random member of an identifiable group (e.g., Guiso, Sapienza, and Zingales [2004b]; McEvily et al. [2006]). This is different from personalized trust, the mutual trust people develop through repeated interactions (Greif 1993), which is more important in relational contracts.
For our purposes, we first recoded the answers to the trust question, setting them to 1 (no trust at all), 2 (not very much trust), 3 (some trust), and 4 (a lot of trust). We then aggregated responses by country and year, computing the mean value of the responses to each survey. Table I shows the average level of trust that citizens from each country have toward citizens of other countries. There is considerable variation in the level of trust exhibited from one country to another.
The average level of trust ranges from a minimum trust of 2.13 (the trust of Portuguese toward Austrians) to a maximum of 3.69 (the trust of Finns toward Finns). Besides this variability, in Table I we find the same three regularities found in the small survey presented in the Introduction. First, there are systematic differences in how much a given country trusts and how much it is trusted by others (see the last row and last column of Table I). For instance, Panel B shows that the Portuguese and the Greeks are those who trust the least and the Swedes those who trust the most.
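The recoding and aggregation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (origin, destination, year, answer), the function name `mean_trust`, and the sample responses are hypothetical and are not taken from the Eurobarometer files; only the 1 to 4 coding comes from the text.

```python
from collections import defaultdict

# Coding stated in the text: 1 (no trust at all) up to 4 (a lot of trust).
TRUST_CODES = {
    "no trust at all": 1,
    "not very much trust": 2,
    "some trust": 3,
    "a lot of trust": 4,
}


def mean_trust(responses):
    """Average coded trust by (origin, destination, year).

    `responses` is an iterable of (origin, destination, year, answer) tuples.
    This record layout is a hypothetical one chosen for the example.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of codes, count]
    for origin, destination, year, answer in responses:
        cell = totals[(origin, destination, year)]
        cell[0] += TRUST_CODES[answer]
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up responses, for illustration only.
sample = [
    ("France", "Germany", 1990, "some trust"),
    ("France", "Germany", 1990, "a lot of trust"),
    ("France", "Italy", 1990, "not very much trust"),
]
averages = mean_trust(sample)
```

Each cell of a matrix like Table I would then be a further average of such country-pair means across survey years.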
TABLE I
THE TRUST MATRIX
                                                      Countries of destination
Countries of origin    Aus   Bel    UK   Den    NL   Fin   Fra   Ger   Gre   Ire   Ita   Nor   Por   Spa   Swe  Average
Austria               3.56  2.95  2.61  2.95  2.95  2.94  2.62  3.09  2.52  2.55  2.43  3.00  2.50  2.58  3.05     2.82
Belgium               2.83  3.28  2.84  3.01  2.90  2.92  2.92  2.75  2.45  2.75  2.40  2.91  2.53  2.59  2.99     2.80
United Kingdom        2.89  2.91  3.29  3.13  3.16  2.98  2.32  2.62  2.54  2.61  2.51  3.06  2.74  2.47  3.03     2.82
Denmark               3.22  3.18  3.22  3.39  3.33  3.20  2.86  3.12  2.61  3.02  2.53  3.50  2.67  2.66  3.41     3.06
Netherlands           2.90  3.18  3.00  3.29  3.28  3.25  2.72  2.84  2.59  2.80  2.35  3.30  2.74  2.64  3.34     2.95
Finland               3.29  3.07  3.18  3.30  3.14  3.69  2.92  2.89  2.68  2.92  2.51  3.48  2.67  2.61  3.35     3.05
France                2.70  3.07  2.55  2.96  2.94  2.91  3.18  2.74  2.53  2.72  2.43  2.97  2.59  2.68  2.99     2.80
Germany               2.98  2.84  2.69  2.97  2.90  2.85  2.85  3.50  2.51  2.59  2.36  2.92  2.48  2.66  2.99     2.81
Greece                2.32  2.60  2.34  2.56  2.55  2.42  2.78  2.31  3.21  2.55  2.33  2.40  2.60  2.71  2.51     2.55
Ireland               2.93  2.93  2.81  2.99  3.00  2.92  2.81  2.78  2.50  3.33  2.65  2.93  2.65  2.64  2.92     2.85
Italy                 2.66  2.64  2.51  2.70  2.77  2.78  2.66  2.63  2.40  2.37  2.80  2.78  2.32  2.64  2.89     2.64
Norway                   —  3.18  3.27  3.53  3.26     —  2.93  2.99  2.52  3.01  2.65     —  2.60  2.56     —     2.95
Portugal              2.13  2.66  2.66  2.66  2.70  2.18  2.91  2.54  2.41  2.51  2.55  2.22  3.29  2.59  2.24     2.55
Spain                 2.65  2.73  2.31  2.73  2.85  2.71  2.37  2.66  2.47  2.57  2.61  2.79  2.51  3.32  2.84     2.67
Sweden                3.53  3.23  3.43  3.57  3.33  3.49  3.04  3.13  2.88  3.26  2.81  3.65  2.97  2.86  3.59     3.25
Average               2.90  2.96  2.85  3.05  3.00  2.95  2.79  2.84  2.59  2.77  2.53  2.99  2.66  2.68  3.01
Note. This table displays the average level of trust from citizens of country of origin (rows) to citizens of country of destination (columns). Trust is calculated by taking the average response to the following question: “I would like to ask you a question about how much trust you have in people from various countries. For each, please tell me whether you have a lot of trust, some trust, not very much trust, or no trust at all.” The answers are coded in the following way: 1 (no trust at all), 2 (not very much trust), 3 (some trust), 4 (a lot of trust).
To isolate these country-specific factors we run the following regression:
(1)  $\overline{\mathrm{Trust}}_{ijt} = \kappa_i + \lambda_j + \sum_{t=1}^{n} \gamma_t \mathrm{Year}_t + \varepsilon_{ijt}$,
where $\overline{\mathrm{Trust}}_{ijt}$ is the trust of country $i$ for country $j$ in the survey done at time $t$, $\kappa_i$ a country-of-origin fixed effect, $\lambda_j$ a country-of-destination fixed effect, and $\mathrm{Year}_t$ calendar-year dummies. Because we are interested in trust across different populations, we drop all the observations when $i = j$. In Figure I, we report the fixed effects of the country of origin and the country of destination relative to Ireland (the actual estimates are reported in the Online Appendix). A Swedish citizen trusts others 17% more on average than an Irish citizen and 27% more than a Greek citizen. The least trusted population is the Italians (like in the introductory example), whereas the most trusted ones are the Swedes. Interestingly, there is a correlation between trusting and being trusted. Nordic countries are at the top of the level of trustworthiness and tend to trust others the most. Although not definitive proof, this fact suggests that people excessively apply the level of trustworthiness of their own countrymen to people from other countries. This result is also consistent with experimental evidence in Glaeser et al. (2000) and Sapienza, Toldra, and Zingales (2007).
If all (or almost all) the variation in the data was explained by the attitude that citizens of a country have toward trust (being trusted), there would be little hope for relative trust to be able to affect the patterns of bilateral trade. However, country-of-origin fixed effects and country-of-destination fixed effects explain only 64% of the variability in trust. There remains a considerable portion to be explained with match-specific variables. The British, for instance, tend to trust the French even less than they trust the Italians and the Spanish and much less than they trust the Belgians and the Dutch. The French reciprocate by trusting the British as much as they trust (little) the Greeks.
III. WHAT EXPLAINS BILATERAL DIFFERENCES IN TRUST?
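As a reference point for the match-specific analysis, the fixed-effects decomposition in regression (1) can be sketched as a dummy-variable least-squares fit. The sketch below is a minimal illustration, not the authors' estimation code: the function names, the synthetic three-country data, and the choice of the alphabetically first level as reference category (the paper reports effects relative to Ireland) are all assumptions made for the example.

```python
import numpy as np


def dummies(labels, prefix):
    """Dummy columns for every level except the first, which is the reference."""
    levels = sorted(set(labels))[1:]
    cols = np.array([[1.0 if lab == lev else 0.0 for lev in levels] for lab in labels])
    return cols.reshape(len(labels), len(levels)), [(prefix, lev) for lev in levels]


def fit_trust_fixed_effects(origin, dest, year, trust):
    """Least-squares fit of trust on origin, destination, and year dummies."""
    blocks = [np.ones((len(trust), 1))]  # intercept column
    names = [("const", "")]
    for labels, prefix in ((origin, "origin"), (dest, "dest"), (year, "year")):
        cols, col_names = dummies(labels, prefix)
        blocks.append(cols)
        names.extend(col_names)
    X = np.hstack(blocks)
    beta, *_ = np.linalg.lstsq(X, np.asarray(trust, dtype=float), rcond=None)
    return dict(zip(names, beta))


# Synthetic data with known effects (hypothetical values, not estimates from the paper).
kappa = {"A": 0.0, "B": 0.2, "C": -0.1}   # country-of-origin effects
lam = {"A": 0.0, "B": 0.3, "C": 0.5}      # country-of-destination effects
gamma = {1990: 0.0, 1995: 0.1}            # year effects
rows = [(i, j, t) for i in kappa for j in lam for t in gamma if i != j]  # drop i = j
origin, dest, year = zip(*rows)
trust = [2.5 + kappa[i] + lam[j] + gamma[t] for i, j, t in rows]
coef = fit_trust_fixed_effects(origin, dest, year, trust)
```

Because the synthetic data contain no noise, the fit recovers the effects that generated them; with survey data the same design matrix would yield estimates relative to whichever reference country is dropped.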
FIGURE I
Fixed Effects of Country of Origin and Destination Relative to Ireland
In this section we try to explain bilateral trust with match-specific variables, after controlling for country fixed effects. To avoid understating the standard errors due to repeated observations, we follow Bertrand, Duflo, and Mullainathan (2004) and collapse the data by averaging over time the residuals of regressing trust on calendar-year dummies. Hence, our regression will be
(2)  $\overline{\mathrm{Trust}}_{ij} = \kappa_i + \lambda_j + \beta X_{ij} + \varepsilon_{ij}$,
where $\overline{\mathrm{Trust}}_{ij}$ are the residuals of regressing trust on calendar-year dummies averaged over time and $X_{ij}$ are match-specific variables that we describe below.
III.A. Determinants of Bilateral Trust
Why should countries differ in their trust toward the same population? One possibility is that these variations are just noise and, as such, should not be correlated with any possible determinants. Another possibility is that these variations arise from differences in the information sets: more informed countries will have a better estimate, whereas poorly informed ones will have a worse estimate. The alternative is that there might be some sort of bias, in either the perception or the behavior. The British might have a distorted view of French reliability or the French might derive a special pleasure from breaching the trust of a British person. For the moment, we are going to collapse both of these latter explanations, which are difficult to separate, under the term of “cultural determinants,” but we will return to this later.
Proxies for Information. As measures of information, we use the geographical distance between the two countries, their proximity, and the commonality between the two languages. The geographical distance between two countries is the log of distance in kilometers between the major cities (usually the capital) of the respective countries.4 We also add a dummy variable to indicate when two countries share a common land border (Frankel, Stein, and Wei 1995). As a measure of language commonality, we use an indicator variable equal to 1 if two countries share the same official language.5 We use the transportation cost estimates introduced by Giuliano, Spilimbergo, and Tonon (2006) as an additional measure of distance. These transportation costs are measured using shipping companies’ quotes collected from Import Export Wizard (a shipping company providing transportation quotes around the world).6
To measure the level of information the citizens of one country have about citizens of another, we follow Portes and Rey (2005) and collect the number of times the country toward which trust is expressed appears in the headlines of a major newspaper in the country that expresses the trust. In Factiva, we searched the newspaper with the highest circulation for each country. For each pair of countries i and j, we recorded the number of articles in the newspaper of country i that mentioned country j or its citizens in the headline. We divided this number by the number of total news stories on foreign countries.7
In addition to these measures, we use the La Porta et al. (1998) classification of legal origin and construct a dummy variable equal to 1 when the legal system of two countries is derived from the same legal family (i.e., French, German, Scandinavian, English). Commonality in legal origin may in principle reflect the fact that citizens of countries having similar legal systems trust each other more because there is less fear of the unknown. The legal tradition is likely to be very highly correlated with a common heritage and other cultural variables. Thus, by controlling for common legal origin, we underestimate the potential effect of culture in biasing the perception of trustworthiness.
4. This measure is from Frankel, Stein, and Wei (1995). We also tried our regressions with alternative measures of distance between two countries and the results did not change substantially. Specifically, we used distance in radians of the unit circle between country centroids (Boisso and Ferrantino 1997) and the great circle between the largest cities (Fitzpatrick and Modlin 1986).
5. This variable is from Jon Haveman’s website: http://www.macalester.edu/research/economics/PAGE/HAVEMAN/Trade.Resources/TradeData.html.
Proxies for Culture. The first proxy for culture is an indicator of religious similarity equal to the empirical probability that two randomly chosen individuals in two countries will share the same religion. We obtain this measure by taking the product of the fraction of individuals in country j and in country i who have religion k and then we sum across all religions k (k = Catholic, Protestant, Jewish, Muslim, Hindu, Buddhist, Orthodox, no religion, other affiliation).
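The religious similarity index just defined is a sum of products of population shares, which the following minimal Python sketch makes explicit. The function name and the denomination shares are hypothetical and chosen only for illustration; they are not WVS figures.

```python
def religious_similarity(shares_i, shares_j):
    """Probability that a random person from country i and a random person
    from country j report the same religion: sum over k of s_ik * s_jk."""
    return sum(share * shares_j.get(religion, 0.0)
               for religion, share in shares_i.items())


# Made-up denomination shares for two hypothetical countries.
country_i = {"Catholic": 0.8, "Protestant": 0.1, "no religion": 0.1}
country_j = {"Catholic": 0.5, "Protestant": 0.3, "no religion": 0.2}

similarity = religious_similarity(country_i, country_j)  # 0.8*0.5 + 0.1*0.3 + 0.1*0.2
```

The index is symmetric in the two countries and is bounded above by 1, a value it reaches only when both populations belong entirely to the same denomination.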
To calculate this variable we use the percentage of people belonging to each religious denomination from the WVS (see Guiso, Sapienza, and Zingales [2003]). Although religious differences are rooted in past history, this history is relatively recent (300–400 years) and could reflect some comparative advantage in trading. For this reason, we resort to ethnic differences to capture deeper cultural roots.
6. http://www.importexportwizard.com. Specifically, we use the cost in U.S. dollars of transporting 1,000 kg of unspecified freight type load (including machinery, chemicals, etc.) with no special handling required, using the optimal combination of going through land and water to transport the goods.
7. In Factiva, we were unable to locate any newspaper from Greece and Finland. Hence, when we use press coverage the size of sample drops.
Much of the ethnic variation in Europe reflects Neolithic invasions: two-thirds of Europeans descend from Asian invaders and one-third from African invaders (Cavalli-Sforza 2000).8 To measure these ethnic differences, we use the genetic distance between indigenous populations as developed by Cavalli-Sforza, Menozzi, and Piazza (1996).9 This measure is based on the existence of genetic or DNA polymorphism (a situation in which a gene or a DNA sequence exists in at least two different forms [alleles]). A simple example of polymorphism is the ABO blood group classification. Although ABO alleles are present in all populations, the frequency of each allele varies substantially across populations. For example, the O allele has a frequency of 61% in African populations and 98% in Native American populations. These frequency differences in alleles hold true for other genes or DNA sequences as well. As a first approximation, Cavalli-Sforza, Menozzi, and Piazza (1996) derive a measure of the differences in the genetic composition between two populations by summing the differences in frequencies of these polymorphisms.10
As an alternative measure of distance between two populations, we derive an indicator of somatic distance, based on the average frequency of specific traits in the indigenous population reported by Biasutti (1954). For height, hair color (pigmentation), and cephalic index (the ratio of the length and width of the skull), Biasutti (1954) draws a map of the prevailing traits in each country in Europe. For each trait, European Union countries fall into three different categories. For hair color we have “Blond prevails,” “Mix of blond and dark,” and “Dark prevails.” We arbitrarily assign the score of 1 to the first, 2 to the second, and 3 to the third.
When a country’s somatic characteristics belong to more than one category, we take the country’s most prevalent category. We then compute the somatic distance between two countries as the sum of the absolute value of the difference in each of these traits (see Online Appendix for more details). Somatic and genetic distances are highly correlated (.53). Hence, we will be able to use only one at a time.
8. Giuliano, Spilimbergo, and Tonon (2006) claim that genetic distance is simply a proxy for transportation costs, at least in the Neolithic Age. Historical transportation costs, however, are not identical to current ones. Before the creation of several tunnels, the Alps represented a formidable barrier to communication between Italy and the neighboring countries. Hence, when we control for today’s transportation costs in the regressions, genetic (or somatic) distance captures the historical transportation costs, which led to different cultural enclaves.
9. See also Menozzi, Piazza, and Cavalli-Sforza (1978).
10. For a more detailed description of this measure see the Online Appendix.
Besides being proxies for cultural distance, both somatic and genetic distances can be interpreted as measures of genetic dissimilarities. As DeBruine (2002) has shown in an experiment, people trust people who look like them more than those who do not. Hence, these two variables might proxy for a genetic element in trust, rather than for a cultural one. Either way, however, they are a source of a potential bias that distorts an objective assessment of the trustworthiness of a foreign population.
To capture these long-term elements of culture, we also use a measure of linguistic common roots created by Fearon and Laitin (2003). It is based on a count of the number of common branches two languages share in the language trees as reported by Ethnologue.11
As a last measure of culture, we compute the number of years a country pair has been engaged in a war between year 1000 and 1970. Because “history is very much a mythical construction, in the sense that it is a representation of the past linked to the establishment of an identity in the present” (Friedman 1992, p. 195), we reconstruct wars using today’s borders. Cultural formation at school is a vehicle for prolonging the memory of facts that took place many years ago (this is why we count wars over almost a millennium). Presumably, countries that have a long history of wars and conflict will mistrust each other. As Table I shows, the clear tendency of the French to trust the British less than any other country may reflect the 198 years that these two countries have waged war against each other since year 1000.
The summary statistics of these variables are reported in Table II (Panels A, C, D, and E), computed for the different samples used in the paper.
III.B. Empirical Results
In Table III, we report the results of our estimates on the determinants of trust according to (2). Our dependent variable
11. http://www.ethnologue.com.
Two languages that come from completely different families have zero branches in common, whereas (say) English and French have one branch in common because they are both Indo-European, but English is Germanic and French is Romance. Fearon and Laitin (2003) argue that for a measure of cultural distance, the move from zero to one common node is more meaningful than a move from, say, five to six, so that a transformation with “diminishing returns” is better than simply counting common nodes. So, we use linguistic common roots = # common nodes/(1 + # common nodes), though we also tried other specifications with similar results.
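The somatic distance and the linguistic common roots transformation described in the text and in footnote 11 both reduce to simple arithmetic, sketched below in Python. The function names, the trait keys, and the category scores assigned to the two example countries are hypothetical; only the 1 to 3 scoring, the sum of absolute differences, and the formula # common nodes/(1 + # common nodes) come from the text.

```python
def somatic_distance(traits_a, traits_b):
    """Sum of absolute differences in category scores (1, 2, or 3) across traits."""
    return sum(abs(traits_a[trait] - traits_b[trait]) for trait in traits_a)


def linguistic_common_roots(common_nodes):
    """Diminishing-returns transform: # common nodes / (1 + # common nodes)."""
    return common_nodes / (1 + common_nodes)


# Hypothetical category scores for two countries (not Biasutti's actual coding).
country_a = {"height": 1, "hair_color": 1, "cephalic_index": 2}
country_b = {"height": 2, "hair_color": 3, "cephalic_index": 2}

distance = somatic_distance(country_a, country_b)   # |1-2| + |1-3| + |2-2|
one_branch = linguistic_common_roots(1)              # one shared branch, e.g. English and French
```

The transform moves from 0 to 0.5 when two languages gain their first common node but only from about 0.83 to about 0.86 between five and six common nodes, which is the diminishing-returns property the footnote argues for.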
TABLE II
SUMMARY STATISTICS
                                                              Mean   Median  Std. dev.      Min      Max      N
A. Trust and control variables
Average trust                                                 0.06     0.04      0.30     −0.62     0.90    207
Log of distance                                               7.08     7.18      0.64      5.16     8.12    207
Common border                                                 0.14     0.00      0.35      0.00     1.00    207
Common language                                               0.04     0.00      0.19      0.00     1.00    207
Same legal origin                                             0.27     0.00      0.45      0.00     1.00    207
Religious similarity                                          0.29     0.23      0.26      0.00     0.87    207
Genetic distance (FST values ×10,000)                        73.66    63.00     54.80      9.00   289.00    207
Somatic distance                                              2.56     3.00      1.26      0.00     5.00    207
Fraction of years at war (1000–1970)                          0.02     0.00      0.03      0.00     0.20    207
Linguistic common roots                                       0.51     0.50      0.24      0.00     0.94    180
Transportation costs                                        186.13   185.00     17.09    160.00   249.00    207
Press coverage                                                0.03     0.01      0.04      0.00     0.31    179
B. Statistics of Canada
Log of export to partner country                             14.78    14.79      1.58      9.94    17.83    595
Average trust from importer to exporter                       2.74     2.74      0.28      1.99     3.57    595
Press coverage                                                0.04     0.02      0.05      0.00     0.31    595
Log of distance                                               6.86     7.01      0.69      5.16     8.12    595
Common border                                                 0.21     0.00      0.41      0.00     1.00    595
Common language                                               0.06     0.00      0.24      0.00     1.00    595
Religious similarity                                          0.33     0.32      0.26      0.00     0.87    595
Somatic distance                                              2.49     3.00      1.21      0.00     5.00    595
Same origin of the law                                        0.30     0.00      0.46      0.00     1.00    595
Transportation costs                                          5.19     5.18      0.08      5.08     5.52    595
Linguistic common roots                                       0.56     0.50      0.17      0.00     0.94    573
Correlation of consumption by industry                        0.89     0.90      0.06      0.72     0.99    474
C. OECD foreign direct investment
Outward stock of FDI (log)                                   21.10    21.40      2.14     12.42    24.18    439
Average trust from country to each partner                    2.77     2.77      0.27      2.10     3.53    439
Press coverage                                                0.05     0.04      0.06      0.00     0.31    439
Log of distance                                               6.78     6.96      0.72      5.16     8.12    439
Common border                                                 0.24     0.00      0.43      0.00     1.00    439
Common language                                               0.09     0.00      0.28      0.00     1.00    439
Same legal origin                                             0.32     0.00      0.47      0.00     1.00    439
Religious similarity                                          0.37     0.34      0.23      0.01     0.87    439
Somatic distance                                              2.67     3.00      1.27      0.00     5.00    439
Linguistic common roots                                       0.56     0.50      0.21      0.00     0.94    413
Transportation costs                                          5.18     5.15      0.09      5.08     5.52    439
D. Portfolio data (Morningstar)
Percentage invested in partner country                        0.04     0.03      0.03      0.00     0.14    108
Inverse covariance of stock market returns                   −0.07    −0.04      0.15     −0.59     0.13    108
Common border                                                 0.21     0.00      0.41      0.00     1.00    108
Common language                                               0.03     0.00      0.14      0.00     1.00    108
Log of distance                                               6.80     6.97      0.64      5.16     7.86    108
Press coverage                                                0.04     0.02      0.04      0.00     0.18     98
Average trust from investing country to partner               2.89     2.89      0.30      2.31     3.65    108
Religious similarity                                          0.31     0.29      0.25      0.01     0.87    108
Somatic distance                                              2.69     3.00      1.25      0.00     5.00    108
Distance in the characteristics of security laws (LLSV)       7.32     6.67      2.37      1.83    12.40    108
Linguistic common roots                                       0.63     0.67      0.13      0.50     0.94     89
Same legal origin                                             0.25     0.00      0.44      0.00     1.00    108
Notes. Panel A contains summary statistics for trust and for the bilateral controls. Trust is calculated by taking the average response to the following question: “I would like to ask you a question about how much trust you have in people from various countries. For each, please tell me whether you have a lot of trust, some trust, not very much trust, or no trust at all.” The answers are coded in the following way: 1 (no trust at all), 2 (not very much trust), 3 (some trust), 4 (a lot of trust). The sample statistics presented here for trust are obtained after collapsing the data by taking time averages (after partialing out time effects). Distance is the log distance between the capital of two countries. Common border is a dummy variable equal to 1 if two countries share at least one border (it is coded 1 if countries are the same). Common language is an indicator variable equal to 1 if the two countries share the same official language. Same legal origin is a dummy variable that is equal to 1 if two countries share the same origin of law (i.e., English, French, German, or Scandinavian), following the La Porta et al. (1998) classification. Religious similarity measures the fraction of people with the same religious faith in the two countries. Genetic distance is the coancestry coefficient (Reynolds, Weir, and Cockerham 1983) calculated by Cavalli-Sforza, Menozzi, and Piazza (1996). Somatic distance between two populations is based on the distance between three anthropometric measures: height, hair color (pigmentation), and cephalic index (Biasutti 1954). Number of years at war have been calculated using the current nations’ borders as definition of the countries. Linguistic common roots is based on a count of the number of common branches two languages share in the language trees as in Fearon and Laitin (2003). 
Transportation costs between a pair of countries are calculated following Giuliano, Spilimbergo, and Tonon (2006) as the shipping quotes in year 2006 collected by Import Export Wizard, a shipping company that calculates the surface freight estimates of transportation costs in U.S. dollars for a “1000 kg unspecified freight type load (including machinery, chemicals, etc.) with no special handling required, using the optimal combination of going through land and water to transport the goods.” Press coverage is the number of times a country name appears in the headlines of the major newspaper in each country over the total number of foreign news. Panel B shows summary statistics for the trade data set. The data contain export volume for a panel of eighteen European countries in the period between 1970 and 1996 (Source: Statistics of Canada). The correlation of consumption between pairs of countries is obtained by correlating the level of consumption by ISIC codes between country i and country j for years 1989–1994 (Source: Nicita and Olarreaga 2007). Consumption in each ISIC code/country is defined as GDP plus imports, minus exports. Panel C shows summary statistics for the FDI data. Outward stock of FDI (log) is from the OECD data and includes a panel between 1970 and 1996 of eighteen European countries. Panel D shows summary statistics for the portfolios data sets. The percentage invested in the partner country is the net portfolio investment of a given country into another country defined as the stock of cross-border holdings of equities and long- and short-term debt securities valued at market prices prevailing at the end of 2001 (from Morningstar data) divided by the sum of all foreign equity holdings plus market capitalization–foreign liabilities. The inverse of the covariance of stock market returns is calculated using monthly data for each country (DATASTREAM). 
Following Vlachos (2004), distance in security law regulation is the sum of the absolute difference between the score in 21 characteristics analyzed in La Porta, López-de-Silanes, and Shleifer (2006).

TABLE III
DETERMINANT OF TRUST

                                            (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)
Common language                             0.05       0.09∗      0.11∗      0.09∗      0.08       0.02       0.04       0.05       0.08
                                            (0.07)     (0.05)     (0.06)     (0.05)     (0.05)     (0.06)     (0.06)     (0.06)     (0.06)
Log (distance)                              −0.11∗∗∗   −0.04∗     −0.05∗     −0.04      −0.01      −0.01      0.00       0.01       −0.01
                                            (0.03)     (0.02)     (0.03)     (0.03)     (0.02)     (0.02)     (0.03)     (0.03)     (0.03)
Common border                               −0.01      −0.05      −0.01      −0.05      −0.04      −0.04      −0.05      −0.05      −0.03
                                            (0.05)     (0.04)     (0.04)     (0.04)     (0.03)     (0.03)     (0.04)     (0.04)     (0.04)
Fraction of years at war (1000–1970)                   −1.16∗∗∗   −1.07∗∗∗   −1.16∗∗∗   −1.07∗∗∗   −1.16∗∗∗   −1.26∗∗∗   −1.26∗∗∗   −1.07∗∗∗
                                                       (0.29)     (0.39)     (0.29)     (0.29)     (0.29)     (0.39)     (0.39)     (0.39)
Religious similarity                                   0.15∗∗∗    0.24∗∗∗    0.15∗∗∗    0.15∗∗∗    0.11∗∗     0.13∗∗∗    0.13∗∗∗    0.15∗∗∗
                                                       (0.04)     (0.05)     (0.04)     (0.04)     (0.04)     (0.05)     (0.05)     (0.05)
Somatic distance                                       −0.06∗∗∗              −0.06∗∗∗   −0.05∗∗∗   −0.04∗∗∗   −0.04∗∗∗   −0.04∗∗∗   −0.03∗∗∗
                                                       (0.01)                (0.01)     (0.01)     (0.01)     (0.01)     (0.01)     (0.01)
Genetic distance                                                  −10.00∗    0.06
                                                                  (5.94)     (5.07)
Differences in GDP per capita (percentage)                                              −0.14∗∗∗   −0.14∗∗∗   −0.13∗∗∗   −0.13∗∗∗   −0.09∗∗
                                                                                        (0.04)     (0.03)     (0.03)     (0.03)     (0.03)
Same legal origin                                                                                  0.07∗∗     0.05       0.05       0.05
                                                                                                   (0.03)     (0.03)     (0.03)     (0.04)
Linguistic common roots                                                                                       0.20∗      0.20∗      0.21∗
                                                                                                              (0.11)     (0.11)     (0.11)
Transportation costs × 1,000                                                                                             −0.58      −1.05
                                                                                                                         (1.00)     (0.96)
Press coverage                                                                                                                      −0.73∗∗
                                                                                                                                    (0.34)
Observations                                207        207        207        207        207        207        180        180        154
R2                                          .772       .840       .806       .840       .854       .858       .832       .832       .837

Notes. The dependent variable is the average trust across individuals of a given country toward citizens of other countries. To appropriately estimate the standard errors, we first regressed the observations on year fixed effects, and then we took the residual and collapsed the observations by year. Trust is calculated by taking the average response to the following question: “I would like to ask you a question about how much trust you have in people from various countries. For each, please tell me whether you have a lot of trust, some trust, not very much trust, or no trust at all.” The answers are coded in the following way: 1 (no trust at all), 2 (not very much trust), 3 (some trust), 4 (a lot of trust). All other variables are reported in the notes to Table II. The regressions include country-of-origin and country-of-destination fixed effects. Spatial corrected standard error (see Conley [1999]) are reported in parentheses. Coefficient is statistically different from zero at the ∗∗∗ 1%, ∗∗ 5%, and ∗ 10% level.

is average residual trust.12 Because in regression (1) we removed the effect of a country-of-origin factor and a country-of-destination factor, this specification tries to capture the match-specific factor that drives trust. To correct for potential geographical clustering of our standard errors, all our OLS regressions report spatial corrected standard error (Conley 1999).13 We start by regressing the average residual trust of country i’s citizens toward citizens of country j on our proxies for differences in the information sets (column (1)). If familiarity breeds trust, we should expect that distance and common language have a positive effect on trust. More information, however, allows us to make more precise inferences about other populations’ trustworthiness, which does not necessarily imply more or less trust on average. Common language has a positive effect on trust, but in the basic specification this effect is not statistically significant. By contrast, a greater distance between two countries reduces the level of trust between them. A one-standard-deviation increase in log distance decreases trust by one-fourth of a standard deviation. The common-border dummy has a negative sign, but it is not statistically significant. In column (2), we introduce our cultural variables. The results show that cultural factors are important overall. The three cultural proxies are jointly statistically significant with an F-test of 21.6. Countries with a long history of wars tend to trust each other less. France and England, for example, which have a record of 198 years of war (more than ten times the average of nineteen) should exhibit a bilateral trust that is 0.7 of a standard deviation lower than average, which accounts for the lower bilateral trust that we observe between them. 
Religious similarity has a positive impact on trust: compared to a case where no common religion is shared, a match where 90% of the citizens share the same religion (e.g., Italy and Spain) raises trust by 15 percentage points (corresponding to 40% of its standard deviation). The coefficient of somatic distance shows that citizens of one country tend to be more trusting toward citizens of other countries that are somatically closer. A one-standard-deviation increase in somatic distance lowers bilateral trust by one-quarter of a standard deviation. If we modify our measure of somatic distance to include only differences in the more visible traits (hair and height), the effect is even stronger (not reported).
12. We obtained similar results (not reported) when we used median trust or the percentage of individuals trusting a lot as the dependent variable.
13. Because we have both the trust from France to Great Britain and from Great Britain to France, and all of the bilateral regressors for this pair of countries are unchanged, we need to assume that their residuals are not independent. For this reason, in a previous version we clustered the standard errors at the pair-of-countries level, with very similar results.
In column (3), we substitute genetic distance for somatic distance. The effect is similar but stronger. A one-standard-deviation increase in genetic distance lowers bilateral trust by 1.8 standard deviations. When we introduce both in the regression (column (4)), the genetic-distance coefficient drops dramatically and loses statistical significance. This is not surprising given the high correlation between these two variables. Because both are trying to capture the same dimension, we will drop the less significant of the two (i.e., genetic distance) from the following regressions.
Alesina and La Ferrara (2002) document that, in the United States, differences in income are important factors in explaining trust within a community. In column (5), we try to see whether these ideas also apply to trust across communities (or countries) by inserting the relative difference in gross domestic product per capita as an additional regressor. Confirming Alesina and La Ferrara (2002), this variable has a negative and statistically significant effect on trust, but its insertion does not change the magnitude of the coefficients of the other variables substantially.
Another possibility is that our cultural variables are a proxy for differences in the legal origin. If countries with a similar legal system understand each other more and trust more, it is ambiguous whether this is an information effect or a cultural effect. For this reason, in column (6), we introduce an indicator variable equal to 1 if two countries have the same legal origin. Not surprisingly, this variable has a positive and statistically significant effect. Countries with a common legal origin have one-fourth of a standard deviation higher trust.
This effect reduces the impact of two of the other three cultural variables (religion similarity and somatic distance), but they remain statistically significant. Another variable that may proxy for culture, but may also proxy for ease in (verbal) communication is the commonality in linguistic roots. When we insert it in column (7), we find that it has a positive but not statistically significant effect. Interestingly, commonality of linguistic roots reduces the effect of common legal origin (which becomes insignificant) but does not affect the other cultural proxies, which remain statistically significant. Thus, even when we control for variables that, at least in part,
proxy for culture, our cultural variables retain an economically and statistically significant effect.
Giuliano, Spilimbergo, and Tonon (2006) claim that genetic distance is just a proxy for transportation costs, which are mismeasured by the log distance between two countries. If this were the case, trust might simply be the result of trade, with little or no cultural effect. To address this concern, we add transportation costs to the regression (column (8)). Transportation costs have a negative effect on trust, but this effect is not statistically significant. More important, the coefficients of all the other variables (in particular, somatic distance) are unaffected. This result is not specific to somatic distance; with genetic distance, we reach similar conclusions.
Finally, in column (9) we introduce a direct measure of the knowledge that citizens of country i have regarding the citizens of country j, as measured by press coverage. The coefficient is negative and statistically significant. The most likely interpretation of this result is that newspapers tend to report bad news and this creates a negative bias, which is stronger when more news about a country is reported. All the other results remain the same.

IV. THE EFFECT OF TRUST ON TRADE

Now that we have a better sense of the determinants of bilateral trust we can explore its effects. Is it true that trust (or lack thereof) has first-order economic effects, as suggested by Arrow (1972)?14 More important, can we establish that some cultural factors impact economic exchange? To do so, we try to see what the effect of trust is when inserted in traditional models of economic exchange across countries. We start with trade of goods and services.

IV.A. Data

The first variables we use are data on trade of goods and services assembled by Statistics of Canada.
The World Trade Database is derived from UN COMTRADE data; its advantage over other data sets is that it provides bilateral trade statistics at the four-digit Standard International Trade Classification (SITC) level.15 This database provides a time series of trade value, disaggregated according to trading partner and four-digit SITC level for the period 1970–1996. Of this long panel we use only data for the years for which trust survey data are available (1970, 1976, 1980, 1986, 1990, 1993, 1994, and 1996). The sample statistics for the data are reported in Panel B of Table II.16
14. For a simple model of how small differences in trust can have first-order effects on economic decisions, see Section I of Guiso, Sapienza, and Zingales (2004a).
15. We also used an aggregate OECD data set, based on custom data, and found very similar results.

IV.B. Empirical Results

Table IV estimates the effect of trust on the amount of trade between two countries according to the following model:

$$\log \mathrm{Export}_{jit} = \kappa_i \times \mathrm{Year}_t + \lambda_j \times \mathrm{Year}_t + \beta\,\mathrm{Trust}_{ijt} + \delta X_{ij} + \varepsilon_{ijt}, \qquad (3)$$

where $\mathrm{Export}_{jit}$ is the exports of country $j$ to country $i$ in year $t$, aggregated over four-digit SITC industries; $\mathrm{Trust}_{ijt}$ is the trust of citizens of country $i$ toward citizens of country $j$ in the survey in year $t$; $X_{ij}$ are bilateral-specific variables that do not vary over time, such as distance; $\kappa_i$ is a country-of-origin fixed effect; $\lambda_j$ is a country-of-destination fixed effect; $\mathrm{Year}_t$ are calendar-year dummies; and $\varepsilon_{ijt}$ is the error term. De facto, regression (3) is a standard gravity regression (e.g., Anderson and van Wincoop [2003]), with the addition of our measure of trust of the importing country toward the exporting one, the Giuliano, Spilimbergo, and Tonon (2006) measure of transportation costs, country fixed effects for both the importing and the exporting countries, and calendar-year dummies. Following Anderson and van Wincoop (2003), we insert exporter-by-year and importer-by-year fixed effects to account for time-variant frictions.17 Because we are looking at European countries and aggregate the statistics at the country level, we do not have any zero-flow observations, which could bias the estimates (Linders and de Groot 2006).18 The standard errors reported in parentheses are corrected for spatial correlation (Conley 1999).
16. In a robustness test, as a dependent variable we used the log of the average level of export in the years following each survey: 1970–1974 with the 1970 survey, 1975–1979 with the 1976 survey, 1980–1984 with the 1980 survey, 1985–1988 with the 1986 survey, 1989–1991 with the 1990 survey, 1992 with the 1992 survey, 1993 with the 1993 survey data, 1994 with the 1994 survey data, and 1995–1996 with the 1996 survey. The results (available from the authors) are unchanged.
17. Our results are even stronger if instead of the interaction terms we include exporter, importer, and year fixed effects (see Guiso, Sapienza, and Zingales [2004a]). Anderson and van Wincoop (2003) argue against the insertion of “remoteness” into the gravity equation. Our results are unchanged if we add a measure of remoteness.
18. For a theoretical justification of the use of the gravity equation, see Helpman and Krugman (1985).
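As an illustration of how a specification like regression (3) can be estimated, the following is a minimal sketch on synthetic data. Everything in it is an assumption made for the example: the helper name `gravity_ols`, the six hypothetical countries, the three years, and the "true" coefficients of 0.4 on trust and −0.5 on log distance. It uses plain least squares and does not reproduce the spatially corrected standard errors reported in the paper.

```python
import numpy as np


def gravity_ols(log_export, trust, log_dist, importer, exporter, year):
    """OLS of log exports on trust and log distance with importer-by-year
    and exporter-by-year dummies (hypothetical helper, point estimates only)."""

    def dummies(a, b):
        # One indicator column per distinct (a, b) combination.
        keys = np.array([f"{x}|{y}" for x, y in zip(a, b)])
        levels = np.unique(keys)
        return (keys[:, None] == levels[None, :]).astype(float)

    X = np.column_stack([trust, log_dist,
                         dummies(importer, year),
                         dummies(exporter, year)])
    # The two dummy blocks are collinear with each other; lstsq returns a
    # minimum-norm solution, and the trust and distance slopes stay identified.
    coef, *_ = np.linalg.lstsq(X, log_export, rcond=None)
    return coef[0], coef[1]


# Synthetic panel: all numbers below are made up for the illustration.
rng = np.random.default_rng(0)
n_countries, n_years = 6, 3
rows = [(i, j, t) for t in range(n_years)
        for i in range(n_countries) for j in range(n_countries) if i != j]
imp, exp_, yr = map(np.array, zip(*rows))

dist = rng.uniform(5.0, 8.0, size=(n_countries, n_countries))
dist = (dist + dist.T) / 2                      # symmetric, time-invariant
log_dist = dist[imp, exp_]
trust = rng.normal(size=len(rows))
kappa = rng.normal(size=(n_countries, n_years))  # importer-by-year effects
lam = rng.normal(size=(n_countries, n_years))    # exporter-by-year effects
log_export = (0.4 * trust - 0.5 * log_dist + kappa[imp, yr] + lam[exp_, yr]
              + 0.01 * rng.normal(size=len(rows)))

beta_hat, delta_hat = gravity_ols(log_export, trust, log_dist, imp, exp_, yr)
print(beta_hat, delta_hat)
```

Building the dummies from (country, year) pairs is what makes the fixed effects time-varying, in the spirit of the exporter-by-year and importer-by-year effects described above.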

TABLE IV
EFFECT OF TRUST ON TRADE

                                                  OLS (1)     OLS (2)     OLS (3)     OLS (4)     IVGMM (5)   OLS (6)
Mean trust of people in importing country         0.36∗∗      0.29∗       0.25        0.34∗∗      1.20∗∗∗     0.19
  to people in exporting country                  (0.17)      (0.17)      (0.19)      (0.16)      (0.20)      (0.22)
Interaction between trust and diversified good                                                                0.83∗∗∗
                                                                                                              (0.05)
Common language                                   0.58∗∗∗     0.32∗∗      0.37∗∗      0.82∗∗∗     0.94∗∗∗     1.04∗∗∗
                                                  (0.22)      (0.16)      (0.16)      (0.21)      (0.14)      (0.27)
Log (distance)                                    −0.31∗∗∗    −0.43∗∗∗    −0.43∗∗∗    −0.57∗∗∗    −0.61∗∗∗    −0.73∗∗∗
                                                  (0.09)      (0.09)      (0.09)      (0.10)      (0.07)      (0.12)
Common border                                     0.49∗∗∗     0.43∗∗∗     0.41∗∗∗     0.41∗∗∗     0.36∗∗∗     0.35∗∗∗
                                                  (0.11)      (0.10)      (0.11)      (0.10)      (0.06)      (0.13)
Press coverage                                    0.45        −0.03       −0.09       −1.34       −0.89       −2.83∗∗
                                                  (1.05)      (0.93)      (0.94)      (1.0)       (0.60)      (1.12)
Transportation costs                              −1.81∗∗     −0.33       −0.28       0.10        0.63        −1.83
                                                  (0.79)      (0.74)      (0.76)      (0.73)      (0.52)      (1.17)
Same legal origin                                             0.45∗∗∗     0.43∗∗∗     0.36∗∗∗     0.24∗∗∗     0.57∗∗∗
                                                              (0.10)      (0.10)      (0.11)      (0.07)      (0.15)
Linguistic common roots                                                   0.09
                                                                          (0.28)
Correlation of consumption between                                                    −0.95       −1.05∗∗∗    −1.82∗∗
  the two countries                                                                   (0.68)      (0.37)      (0.89)
Exporting-country fixed effects × years           YES         YES         YES         YES         YES         YES
Importing-country fixed effects × years           YES         YES         YES         YES         YES         YES
Observations                                      595         595         573         474         474         951
R2                                                .964        .969        .970        .968                    .849
Hansen J-statistic                                                                                0.090
χ2 p-value                                                                                        .764
Test of excluded instruments                                                                      F(2,349) = 59.66

Notes. The dependent variable is the log of the aggregate export volume from country i to country j, for a panel of seventeen countries belonging to the EEA during the period 1970–1996. All other variables are described in the notes to Table II. All regressions include an interaction between fixed effects for the country of origin and year and for the destination country and year. All columns, except column (5), report OLS regressions where the standard errors are corrected for spatial correlation (Conley 1999). The specification in column (5) is estimated using the generalized method of moments instrumental variables estimator (GMM-IV). The instruments are religious similarity and somatic distance. A test of overidentifying restrictions, Hansen’s (1982) J-statistic, is also reported for the IV regression. The test is calculated from the first-stage residuals of the estimation procedure. We also report the F-test of the excluded instruments. The first-stage regressions are reported in the Online Appendix of the paper. Coefficient is statistically different from zero at the ∗∗∗ 1%, ∗∗ 5%, and ∗ 10% level.

As in the standard gravity equation, a greater distance between two countries negatively affects the level of exports, whereas the presence of a common border and of a common language positively affects it. All these effects are highly statistically significant. As in Giuliano, Spilimbergo, and Tonon (2006), the transportation costs measure has a negative effect on trade, which is statistically significant at the 5% level.19 After controlling for all these variables, our measure of trust has a positive and statistically significant effect on trade. The effect is also economically very large. A one-standard-deviation increase in trust increases exports to a country by ten percentage points, equal to 1.6 standard deviations. In column (2), we test the robustness of this result to the insertion of an indicator variable for commonality of legal origin. This variable can capture the fact that similar institutions foster more trade because they provide more guarantee to the parties involved (De Groot et al. 2004; Vlachos 2004). Alternatively, it can capture part of the cultural effect. This indicator variable has a positive and statistically significant effect on trade. Countries with the same legal tradition trade among themselves 1.5 times more. We find a similar effect when we introduce the commonality of linguistic roots, which does not have a statistically significant impact on trade (column (3)). Another possible objection is that trust might pick up some other cultural similarities such as commonalities in taste. If two countries share the same taste for consumption (e.g., for cheese), they might trade more. To address this problem we construct an index of similarity in consumption patterns across countries. This index is calculated by computing domestic consumption as the sum of gross domestic production in each ISIC code plus imports and minus exports between 1989 and 1994. 
For each pair of countries, then, we compute the correlation in consumption across ISIC sectors.20 When we insert this variable in the OLS specification of our trade regression (column (4)), the sign is negative, but not statistically significant. The size and the statistical significance of the coefficient of trust are unaffected. A similar concern is that countries with a more similar structure of production trade more with each other. To address it, we create an index of production similarity by correlating the GDP data across sectors in the same way as described above. The results (not reported) are unchanged.
19. In an unreported regression, we also controlled for the geographical barriers used by Giuliano, Spilimbergo, and Tonon (2006): the presence of a common sea and the presence of a mountain chain between two countries. These variables are not significant and do not affect the other results.
20. Data on consumption are calculated by extracting data from the following data set: http://econ.worldbank.org/WBSITE/EXTERNAL/EXTDEC/EXTRESEARCH/0,,contentMDK:21085384~pagePK:64214825~piPK:64214943~theSitePK:469382,00.html.
There are at least three reasons to worry about these OLS results. First, although it is possible that trust fosters trade, it is equally possible that trade breeds trust. The second problem is that bilateral trust can capture the effect of other omitted variables (e.g., the existence of established trading outposts, as suggested by Rauch and Trindade [2002]). Finally, measurement errors in the trust variable may affect our results. To address these concerns we instrument our trust variable by using the generalized method of moments estimator (GMM-IV), which allows for heteroscedasticity of unknown form. As instruments we use the cultural determinants of trust (commonality of religion and somatic distance). Note that these instruments are time invariant, yet the average level of trust varies over time. These two instruments pass the Hansen J-test for overidentifying restrictions, but were we to add also the history of wars, the test would fail. The IV estimates are presented in column (5). Not only does trust retain its effect on trade, but the size of the coefficient increases fourfold. A possible explanation is that our instruments may be only weakly correlated with trust. If this is the case, then the two-stage least-squares regressions will be biased and the standard errors misleading. To address this concern, we compute the F-statistic for the joint hypothesis that the instruments’ coefficients are zero in the first-stage regression and report it at the bottom of the table.
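The logic of instrumenting a noisily measured regressor and checking instrument strength with a first-stage F-statistic can be sketched as follows. This is a deliberately simplified illustration, not the estimator used in the paper: it swaps in plain two-stage least squares in place of GMM-IV, reports no standard errors, and runs on synthetic data. The function names, sample size, and coefficients are all assumptions made for the example; the two synthetic instruments merely stand in for variables such as religious similarity and somatic distance.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef


def tsls_with_first_stage_f(y, endog, instruments, controls):
    """Two-stage least squares for one endogenous regressor, plus the
    first-stage F-statistic of the excluded instruments (hypothetical helper)."""
    n = len(y)
    W = np.column_stack([np.ones(n), controls])   # included exogenous regressors
    Z = np.column_stack([W, instruments])         # full instrument set
    # First stage: endogenous regressor on controls and instruments,
    # compared with the restricted regression that excludes the instruments.
    pi, resid_u = ols(Z, endog)
    _, resid_r = ols(W, endog)
    q = instruments.shape[1]
    rss_u, rss_r = resid_u @ resid_u, resid_r @ resid_r
    f_stat = ((rss_r - rss_u) / q) / (rss_u / (n - Z.shape[1]))
    # Second stage: replace the endogenous regressor with its fitted value.
    beta, _ = ols(np.column_stack([Z @ pi, W]), y)
    return beta[0], f_stat


# Synthetic data (all values assumed): true effect of trust equals 1.0,
# but trust is observed with classical measurement error.
rng = np.random.default_rng(1)
n = 2000
instruments = rng.normal(size=(n, 2))
controls = rng.normal(size=(n, 1))
true_trust = instruments @ np.array([0.6, -0.6]) + rng.normal(size=n)
observed_trust = true_trust + rng.normal(size=n)
y = 1.0 * true_trust - 0.5 * controls[:, 0] + 0.5 * rng.normal(size=n)

beta_ols = ols(np.column_stack([observed_trust, np.ones(n), controls]), y)[0][0]
beta_iv, f_stat = tsls_with_first_stage_f(y, observed_trust, instruments, controls)
print(beta_ols, beta_iv, f_stat)
```

In this synthetic setting the OLS slope is attenuated toward zero by the measurement error while the instrumented slope is close to the assumed true value, which mirrors the attenuation-bias explanation discussed in the text for why an IV coefficient can exceed its OLS counterpart.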
In this specification, the F-test is 59.66, comfortably above the threshold recommended by Stock and Yogo (2002). An alternative explanation for the difference in the coefficient is that our trust measure is a noisy measure of the true trust between two countries, and the increase in the coefficient would be the result of a reduction in the standard attenuation bias present when variables are measured with error. If this is the case, the true economic effect is closer to the GMM-IV estimates, which suggests a much larger result. A one-standard-deviation increase in trust increases exports to a country by 63 percentage points. The magnitude of this effect is not very different from the one
found by Rauch and Trindade (2002). They find that the presence of ethnic Chinese networks increases the amount of bilateral trade in differentiated goods by 60%. Alternatively, it is possible that—test of overidentifying restrictions notwithstanding—our instruments are not orthogonal to trade, but pick up a set of cultural, institutional, and legal connections that facilitate trade flows. These cultural effects must be match specific because the institutional factors are controlled for in the country-of-origin or in the country-of-destination fixed effects. If this is the case, our results suggest the importance of culture-specific factors in trade relationships. These factors can help explain the famous Rose (2000) result (confirmed by Rose and Stanley [2005]) that currency unions are associated with a very large increase in trade. Because most of the countries belonging to currency unions in the Rose (2000) sample were countries very culturally connected, where trust is higher, trade will be naturally higher once the obstacle to trade imposed by national currencies is removed. In the last column of Table IV, we test whether the impact of trust on trade varies according to what theory would suggest. Our hypothesis predicts that trust should matter more for goods whose quality can differ more. For these goods, contracts are more difficult to write and hence they are more likely to leave gaps, where trust plays a very important role. Rauch (1999) distinguishes between goods traded in an organized exchange, goods with a reference price, and differentiated goods. Clearly, goods can be traded in an organized exchange only if they are very homogeneous in quality. Similarly, they can have a reference price only if they are not too dissimilar in their intrinsic quality. 
Hence, Rauch’s (1999) classification can also be interpreted as a classification of the degree of trust intensiveness of the different goods.21 For this reason, in the last column of Table IV, we aggregate exports for two subsamples of industries (organized exchange and differentiated goods); then, we run the regression by using the interaction between trust and whether the good is classified as a differentiated good. The effect of trust appears to be economically and statistically indistinguishable from zero for the sample of homogeneous goods, which are traded in organized exchanges. By contrast, the effect is quantitatively large and statistically different from zero (and from the coefficient for homogeneous goods) for differentiated goods: trade in differentiated goods increases by 39% in response to a one-standard-deviation increase in trust.
21. Rauch (1999) made a “conservative” and a “liberal” classification of industries. To minimize ambiguity we excluded industries that were classified in different ways under the two classifications and ran our regressions only for organized exchange goods and differentiated goods.

V. FOREIGN DIRECT INVESTMENT (FDI)

If trust has an impact on trade, it should have an even bigger impact on the willingness to invest in a country. For this reason, we study the impact of trust on FDI.

V.A. Data

Statistics on FDI transactions and positions are based on the database developed by the OECD Directorate for Financial, Fiscal, and Enterprise Affairs. These statistics are compiled according to the concept used for balance of payments (flows) and international investment positions (stocks) statistics. We use only data for countries that belonged to the EEA for the years when trust survey data are available (1970, 1976, 1980, 1986, 1990, 1993, 1994, and 1996). According to the classification used in the balance-of-payment accounts, an FDI enterprise is an incorporated enterprise in which a foreign investor (a resident of another country) has at least 10% of the shares or voting power. As for trade, we restrict our attention to EEA country members, where the same rules for FDI apply. Summary statistics are reported in Table II, Panel C.

V.B. Empirical Results

Table V reports the effect of country i’s trust toward people of country j on the FDI of country i in country j. The specification is as in regression (3) except that the dependent variable is the log of the stock of FDI from country i to country j. Spatially corrected standard errors are reported in parentheses. Column (1) reports the basic specification where, in addition to mean trust, we have country fixed effects, border, language, distance, and press coverage.22 The impact of trust is positive and statistically significant. A one-standard-deviation increase in trust raises the level of FDI by 27%. This result is robust to adding an interaction between the importer- and exporter-country fixed effects and year dummies (not reported).
22. Because number of years at war was significant in the trade regressions, we also inserted it here. Dropping it does not affect the significance and the magnitude of other results.

TABLE V
EFFECT OF TRUST ON FDI

                                                  OLS (1)     OLS (2)     OLS (3)     OLS (4)     IVGMM (5)
Mean trust toward people in destination country   1.35∗∗∗     0.94∗       0.70        0.84∗       6.65∗∗∗
                                                  (0.51)      (0.51)      (0.48)      (0.49)      (1.24)
Common language                                   0.12        0.17        −0.57∗      −0.75∗∗     −2.05∗∗∗
                                                  (0.31)      (0.29)      (0.30)      (0.38)      (0.43)
Log (distance)                                    −0.46∗      −0.22       −0.48∗∗     −0.56∗∗     −0.70∗∗∗
                                                  (0.26)      (0.27)      (0.23)      (0.24)      (0.26)
Common border                                     0.47∗∗      0.44∗∗      0.26        0.34        0.26
                                                  (0.20)      (0.20)      (0.20)      (0.21)      (0.21)
Press coverage                                    2.65        1.67        0.76        1.00        8.97∗∗∗
                                                  (2.29)      (2.18)      (2.04)      (2.24)      (2.69)
Transportation costs                                          −4.55∗∗     −0.23       −0.32       5.13∗∗
                                                              (1.76)      (1.66)      (1.80)      (2.31)
Common law                                                                1.28∗∗∗     1.36∗∗∗     1.38∗∗∗
                                                                          (0.27)      (0.31)      (0.26)
Linguistic common roots                                                               −0.86       −2.41∗∗∗
                                                                                      (0.55)      (0.66)
Investing-country fixed effects × years           YES         YES         YES         YES         YES
Destination-country fixed effects × years         YES         YES         YES         YES         YES
Observations                                      445         445         445         419         419
R2                                                .854        .860        .879        .880
Hansen J-statistic                                                                                0.031
χ2 p-value                                                                                        .859
Test of excluded instruments in first stage                                                       F(2,328) = 24.34

Notes. The dependent variable is the log of outward investment (stocks) from the OECD data (1970–1996) for seventeen countries belonging to the EEA. The independent variables are defined in the notes to Table II. All regressions include the interaction between fixed effects for the country of origin and year and fixed effects for the destination country and year. All columns, except column (5), report OLS regressions where the standard errors are corrected for spatial correlation (Conley 1999). The specification in column (5) is estimated using the generalized method of moments instrumental variables estimator (GMM-IV). The instruments are religious similarity and somatic distance. A test of overidentifying restrictions, Hansen’s (1982) J-statistic, is also reported for the IV regression. The test is calculated from the first-stage residuals of the estimation procedure. We also report the F-test of the excluded instruments. The first-stage regressions are reported in the Online Appendix of the paper. The standard errors reported in parentheses are corrected for spatial correlation (Conley 1999). Coefficient is statistically different from zero at the ∗∗∗ 1%, ∗∗ 5%, and ∗ 10% level.

In column (2), we insert the Giuliano, Spilimbergo, and Tonon (2006) measure of transportation costs. Transportation costs should not have a direct effect on FDI, but could have an
indirect one. Transportation costs act as a barrier to trade, which might induce direct investment as a substitute for exports. Alternatively, transportation costs might act as a proxy for other cultural barriers not captured by our measure of trust. There is also another effect, economic rather than cultural, that goes in the opposite direction: the larger the transportation costs, the larger the FDI monitoring costs. By contrast, common legal rules facilitate FDI monitoring and reduce the importance of transportation costs. Transportation costs have a negative coefficient, which is borderline statistically significant at the 10% level, suggesting that the second interpretation is more likely. When we introduce transportation costs, the effect of trust drops by 30% and becomes statistically insignificant at conventional levels. However, when we introduce an indicator variable for common law origin (column (3)), the coefficient of transportation costs drops almost to zero and becomes statistically insignificant (suggesting that it was a proxy for some cultural effect), while the effect of trust becomes significant again, but only at the 10% level. Countries whose laws share the same origin have more than four times the level of FDI in each other. This result is consistent with Bottazzi, Da Rin, and Hellmann (2007), who find that venture capitalists are more likely to invest in start-ups of countries they trust more. The picture remains unchanged when we control for commonality of linguistic roots (column (4)).
Finally, in column (5) we report the IV regression where we use religious similarity and somatic distance as instruments. When we do so, the coefficient of trust increases dramatically and is highly statistically significant. As reported in the table, these two instruments pass the Hansen J-test for overidentifying restrictions. It is not surprising that the magnitude of the impact of trust on FDI is twice as large as the impact on trade. 
Because FDI are long-term investments, they are more subject to contract incompleteness than any other trade, even the trade of differentiated goods. As such, they should be very trust intensive. Nevertheless, the large difference between OLS estimates and IV ones is worrisome. In principle, it could be a problem of weak instruments. However, the F-test on the coefficients of the instruments in the first-stage regression is F(2, 328) = 24.34. Alternatively, it could be because other cultural factors, correlated with religious similarity and somatic distance, greatly affect FDI. In this latter case, this result suggests that cultural relationships are an important omitted factor in FDI.
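The weak-instrument check mentioned above can be illustrated with a small simulation. The sketch below is not the estimator used in the paper (which is GMM-IV with spatially corrected standard errors); it is plain two-stage least squares with a homoskedastic first-stage F-test on invented data. All variable names and numbers are assumptions chosen for illustration, and classical measurement error in the regressor is assumed only as one possible reason why an IV estimate can exceed an OLS estimate.

```python
import numpy as np

# Simulated data only: every number below is an assumption, not taken from the paper.
rng = np.random.default_rng(0)
n = 2000
instruments = rng.normal(size=(n, 2))              # two hypothetical excluded instruments
true_trust = instruments @ np.array([0.5, -0.5]) + rng.normal(size=n)
observed_trust = true_trust + rng.normal(size=n)   # regressor observed with classical error
log_fdi = 1.0 * true_trust + rng.normal(size=n)    # true slope is 1.0 by construction


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones((n, 1))
X = np.hstack([const, observed_trust[:, None]])
Z = np.hstack([const, instruments])

# OLS on the noisy regressor is attenuated toward zero.
beta_ols = ols(X, log_fdi)[1]

# First stage: regress the noisy regressor on a constant and the instruments.
fitted = Z @ ols(Z, observed_trust)
rss_unrestricted = np.sum((observed_trust - fitted) ** 2)
rss_restricted = np.sum((observed_trust - observed_trust.mean()) ** 2)
q = instruments.shape[1]                           # number of excluded instruments
dof = n - Z.shape[1]                               # residual degrees of freedom
first_stage_f = ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / dof)

# Second stage: use the first-stage fitted values as the regressor (2SLS point estimate).
beta_iv = ols(np.hstack([const, fitted[:, None]]), log_fdi)[1]

print(f"OLS slope:  {beta_ols:.2f}")
print(f"2SLS slope: {beta_iv:.2f}")
print(f"First-stage F({q}, {dof}) = {first_stage_f:.1f}")
```

In this simulated setting the first-stage F is far above the conventional rule-of-thumb threshold of 10, so the gap between the OLS and 2SLS slopes comes from the assumed measurement error rather than from weak instruments. The simulation cannot say which explanation applies to the FDI estimates themselves.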
VI. INTERNATIONAL PORTFOLIO DIVERSIFICATION
Finally, we investigate whether trust also affects the pattern of portfolio investments. By construction, portfolio investments involve investments in minority positions in foreign companies. Hence, if we do find evidence for the effect of trust, we cannot attribute it to selective behavior by the citizens of the country hosting the investment. If the French derive a special pleasure from hurting the British, they will be unable to do it selectively when the British have invested in a minority position, because their actions would mostly affect the other investors, who represent the vast majority and are unlikely to be British.
This is a very demanding test, because the effect of trust on portfolio allocations is likely to be small for two reasons. First, most portfolio investments are in traded securities that are heavily monitored and regulated, where the risk of misappropriation is somewhat limited. Second, we have data only for portfolio allocations of mutual funds, which are run by sophisticated managers less likely to be subject to this type of bias.
VI.A. Data
Ideally, we would like to have data on the international diversification of individual investors; however, these data are not available on a consistent basis. Hence, we resort to portfolio data from institutional investors. The data we use are from Morningstar, which kindly provided us with the geographical breakdown of equity investment of European mutual funds disaggregated by country of origin. We exclude funds located in Luxembourg and Ireland when they are affiliated with companies located in other European countries. This data set includes all funds that report their positions to Morningstar (including balanced and flexible funds, for example). Note that bond investments are not included. Sample statistics are reported in Panel E of Table III.
VI.B. Empirical Results
Table VI reports the empirical results.
TABLE VI
EFFECT OF TRUST ON PORTFOLIO INVESTMENT
                                              OLS        OLS        OLS        OLS        OLS        IVGMM
                                              (1)        (2)        (3)        (4)        (5)        (6)
Mean trust toward people in                   0.11∗∗∗    0.14∗∗∗    0.04       0.15∗∗∗    0.09∗∗     0.27∗∗
  destination country                         (0.04)     (0.05)     (0.03)     (0.05)     (0.05)     (0.11)
Inverse cov. of stock market returns of       0.01       −0.00      −0.05      −0.01      −0.05      0.01
  country of origin and destination           (0.03)     (0.04)     (0.04)     (0.04)     (0.06)     (0.07)
Common language                               0.02       −0.02      −0.02      −0.05∗     −0.01      −0.04
                                              (0.02)     (0.03)     (0.03)     (0.03)     (0.03)     (0.03)
Log (distance)                                −0.06∗∗    −0.04      −0.03      −0.03      −0.04      −0.03
                                              (0.03)     (0.03)     (0.02)     (0.02)     (0.03)     (0.03)
Common border                                 −0.01      −0.02      −0.03      −0.02      −0.05∗     −0.03
                                              (0.03)     (0.03)     (0.02)     (0.03)     (0.03)     (0.03)
Press coverage                                           0.63∗∗     0.30       0.67∗∗     0.57∗∗     0.90∗∗∗
                                                         (0.25)     (0.20)     (0.26)     (0.26)     (0.33)
Same legal origin                                                   0.08∗∗∗
                                                                    (0.02)
Distance in security law regulation × 100                                      0.42       0.86∗∗∗    0.85∗∗∗
                                                                               (0.26)     (0.26)     (0.30)
Linguistic common roots                                                                   0.14∗∗∗    0.03
                                                                                          (0.05)     (0.08)
Observations                                  108        98         98         98         80         80
R2                                            .371       .402       .519       .412       .407
Hansen J-statistic                                                                                   2.277
χ2 p-value                                                                                           .131
Test of excluded instruments in first stage                                                          F(2,44) = 10.18
Notes. The dependent variable measures the percentage of net portfolio investment of a given country into another country. Specifically, the dependent variable is the stock of cross-border holdings of equities and long- and short-term debt securities valued at market prices prevailing at the end of 2001 (from Morningstar data) divided by the sum of all foreign equity holdings plus market capitalization of foreign liabilities. The sample includes all European Union countries. Independent variables are described in the notes to Table II. All regressions include fixed effects for the country of origin and for the destination country. All columns, except column (6), report OLS regressions where the standard errors are corrected for spatial correlation (Conley 1999). The specification in column (6) is estimated using the generalized method of moments instrumental variables estimator (GMM-IV). The instruments are religious similarity and somatic distance. A test of overidentifying restrictions, Hansen’s (1982) J-statistic, is also reported for the IV regression. The test is calculated from the first-stage residuals of the estimation procedure. We also report the F-test of the excluded instruments. The first-stage regressions are reported in the Online Appendix of the paper. Coefficient is statistically different from zero at the ∗∗∗ 1%, ∗∗ 5%, and ∗ 10% level.
The dependent variable is the percentage of the equity portfolio of mutual funds located in country i that is invested in equity of country j, where i ≠ j. In a traditional portfolio model, the only explanatory variables would be the inverse of the covariance of stock market returns and the weight of the country i’s stock market in the world portfolio. Because we include country fixed effects (and the data are just one cross section), this latter variable is absorbed by them. The benchmark model would have only the inverse of the covariance of stock market returns as explanatory variable. To this benchmark, we add the standard proxies for information: a dummy for common borders, a dummy for common
language, the logarithm of the distance between the two capitals, plus our trust variable. As column (1) (Table VI) shows, of all the traditional proxies for information, only the distance is significant, with a negative sign. The degree of trust country i has toward country j has a positive and statistically significant effect on the percentage of equity invested by country i in country j. A one-standard-deviation increase in the trust of people in country i toward people of country j increases the portfolio share of country i in country j by 3 percentage points, which corresponds to an 88% increase in the mean share. This result is robust to adding an interaction between the importer- and exporter-country fixed effects and year dummies (not reported).
In column (2), we introduce Portes and Rey’s (2005) measure of press coverage, which represents a proxy for information.23 As in Portes and Rey (2005), the effect of press coverage is positive and statistically significant. Needless to say, this correlation could reflect the incentives that national press has in reporting information about countries where national investors invest more. Controlling for this additional variable does not reduce the effect of trust. In fact, the estimated coefficient is larger and remains statistically significant at the 5% level, despite the loss of observations.
In column (3), we control also for common origin of the law. Not surprisingly, this variable has a positive and statistically significant effect on the portfolio investments. This effect is very strong: on average, a country invests 8 percentage points more in the equity of another country if they share the same legal origin. As for trade and FDI, the effect of commonality of legal origin captures some of the effect of trust and the coefficient of trust drops to a third. In this case, it also becomes statistically insignificant. As previously discussed, the effect of commonality of law is at least in part a cultural effect. 
23. We also try a specification with the other control variables present in Portes and Rey (2005): telephone traffic and foreign bank branches. Unfortunately, the overlap between the two samples is small (six countries) and even trying to integrate it we end up with only 78 observations. In such a regression, the coefficient of trust is quantitatively similar, but loses statistical significance.
To separate the cultural aspect from the familiarity component, we follow Vlachos (2004) and construct an index of similarity in security law based on the work of La Porta, López-de-Silanes, and Shleifer (2006). This measure is computed as the sum of the absolute difference between the score in the 21 dimensions of the security law analyzed by La Porta, López-de-Silanes, and Shleifer (2006). If common law captures the similarity of legislation, the effect should be captured by the distance in security law. As column (4) shows, distance in security law has a positive (not negative as expected) effect on portfolio investment and this effect is not statistically significant. When we control for this measure, the effect of trust returns significant and strong. Hence, commonality of law was in part capturing the effect culture has on trust. This interpretation is further supported by the results in column (5). When we introduce commonality of linguistic roots, the effect is positive and statistically significant and the effect of trust is reduced by one-third, but still retains statistical significance.
In column (6), we instrument our measure of trust with commonality of religion and somatic distance, and the coefficient of trust more than doubles. As in the previous cases, this change cannot be attributed to weak instruments (the F-test on the coefficients of the instruments in the first-stage regression is equal to F(2, 64) = 13.53) and the two instruments pass the Hansen J-test for overidentifying restrictions. Thus, either the true effect is obscured in the OLS regression by measurement errors or the instruments are capturing some other cultural links that also affect portfolio investments. Either way, these results also point to the importance of trust and cultural links as important and generally omitted factors in portfolio investments.
Overall, these results suggest that an increase in trust has an economically and statistically significant effect on the level of trade, direct investments, and portfolio investments. In most of our analysis, we have referred to these effects as cultural effects because we could not distinguish among three explanations. 
In other words, British expectations about French trustworthiness may reflect a cultural bias of the British. Alternatively, they could reflect a cultural idiosyncrasy of the French, who enjoy treating the British in a different way. Finally, they could be the result of a bad equilibrium in which the French misbehave more toward the British because the British expect them to do so. The latter explanation finds support in an experiment in which people are shown to be less likely to behave in a trustworthy way when they are told that their opponents have low expectations about their level of trustworthiness (Reuben, Sapienza, and Zingales 2008). Hence, British mistrust may be self-fulfilling.
When we talk about trade and FDI, all three explanations are equally plausible. For portfolio investment, however, the latter two explanations are implausible. French companies cannot hurt British investors independently of German or Italian ones. Consequently, when we find that the level of mistrust leads the British to invest less in France, it is not because the French behave differently toward them but because the British have a biased perception of the trustworthiness of the French.
VII. CONCLUSIONS
In this paper we show that trust among European countries differs in systematic ways, which are correlated with their different cultural heritages. Even after controlling for a country’s institutional characteristics and for differences in the information sets, historical and cultural variables affect the propensity of the citizens of one country to trust the citizens of another country. These differences in trust seem to have economically important effects on trade, portfolio investments, and FDI. These macro results are confirmed in a micro study by Bottazzi, Da Rin, and Hellmann (2007). They find that the trust of a venture capitalist’s country toward another country positively affects his propensity to invest in a start-up of that country.
Note that both of these results are obtained within the boundaries of the old European Union, which comprises fairly culturally homogeneous nations. Given that culture represents an important barrier to integration even inside the old European Union, its effect might be much larger on world trade. Cultural differences might also explain why Rose (2000) finds that, historically, currency unions have boosted trade by 235%, whereas Baldwin (2006) finds that the euro currency union increased trade by only 9%. The unions studied by Rose (2000) are among countries with very close cultural roots, such as Belgium and Luxembourg. By contrast, as this paper documents, there are still important cultural barriers within the European Union. 
Although our results are suggestive that these effects can be economically important, they do not allow us to derive any welfare conclusion. First, we identify these effects by looking at within-country variations. As a result, our methodology cannot identify the impact of the average level of trust on the total volume of trade and, subsequently, the welfare implications of our results. If we assume that the effect estimated using within-country variations applies also between countries, then we have the effect that the British perception of the trustworthiness of the Dutch and French makes the British trade 30% more with the former than with the latter. Second, we document only effects on quantities, not on welfare. If it is costless for the British to substitute for French cheese with identical cheese coming from other countries they trust more, then the utility loss they suffer could be minimal. If that is not the case (and to our taste, you cannot easily substitute a French cheese with a Dutch one), then the welfare losses can be substantial. Only future research will be able to tell.
EUROPEAN UNIVERSITY INSTITUTE, ENTE L. EINAUDI, AND CEPR
NORTHWESTERN UNIVERSITY, NBER, AND CEPR
UNIVERSITY OF CHICAGO, NBER, AND CEPR
REFERENCES
Alesina, Alberto, and Eliana La Ferrara, “Who Trusts Others?” Journal of Public Economics, 85 (2002), 207–234.
Ammerman, Albert J., and Luca L. Cavalli-Sforza, The Neolithic Transition and the Genetics of Populations in Europe (Princeton, NJ: Princeton University Press, 1985).
Anderson, James E., and Douglas Marcouiller, “Insecurity and the Pattern of Trade: An Empirical Investigation,” The Review of Economics and Statistics, 84 (2002), 342–352.
Anderson, James E., and Eric van Wincoop, “Gravity with Gravitas: A Solution to the Border Puzzle,” American Economic Review, 93 (2003), 170–192.
Arrow, Kenneth, “Gifts and Exchanges,” Philosophy and Public Affairs, 1 (1972), 343–362.
Baldwin, Richard, “The Euro’s Trade Effects,” European Central Bank Working Paper No. 594, 2006.
Barro, Robert J., and Rachel M. McCleary, “Religion and Economic Growth,” American Sociological Review, 68 (2003), 760–781.
Berg, Joyce, John Dickhaut, and Kevin McCabe, “Trust, Reciprocity, and Social History,” Games and Economic Behavior, 10 (1995), 122–142.
Berkowitz, Daniel, Johannes Moenius, and Katharina Pistor, “Trade, Law, and Product Complexity,” Review of Economics and Statistics, 88 (2006), 363–373.
Bertrand, Marianne, Esther Duflo, and Sendhil Mullainathan, “How Much Should We Trust Differences-in-Differences Estimates?” Quarterly Journal of Economics, 119 (2004), 249–275.
Biasutti, Renato, Le Razze e i Popoli Della Terra, Vol. 1 (Turin: UTET, 1954).
Boisso, Dale, and Michael Ferrantino, “Economic Distance, Cultural Distance, and Openness: Empirical Puzzles,” Journal of Economic Integration, 12 (1997), 456–484.
Bottazzi, Laura, Marco Da Rin, and Thomas Hellmann, “The Importance of Trust for Investment: Evidence from Venture Capital,” ECGI-Finance Working Paper No. 187/2007, 2007.
Burns, Paul, Andrew Myers, and Andy Bailey, “Cultural Stereotypes and Barriers to the Single Market,” 3i/Cranfield European Enterprise Centre, Cranfield School of Management Working Paper No. SWP 20/93, 1993.
Cavalli-Sforza, Luca L., Genes, People, and Languages (Berkeley: University of California Press, 2000).
Cavalli-Sforza, Luca L., Paolo Menozzi, and Alberto Piazza, The History and Geography of Human Genes (Princeton, NJ: Princeton University Press, 1996).
Cohen, Lauren, “Loyalty-Based Portfolio Choice,” Review of Financial Studies, 22 (2009), 1213–1245.
Conley, Timothy G., “GMM Estimation with Cross Sectional Dependence,” Journal of Econometrics, 92 (1999), 1–45.
DeBruine, Lisa M., “Facial Resemblance Enhances Trust,” Proceedings of the Royal Society of London B, 269 (2002), 1307–1312.
De Groot, Henri L. F., Gert-Jan Linders, Piet Rietveld, and Uma Subramanian, “The Institutional Determinants of Bilateral Trade Patterns,” Kyklos, 57 (2004), 103–123.
Fearon, James D., and David D. Laitin, “Ethnicity, Insurgency, and Civil War,” American Political Science Review, 97 (2003), 75–90.
Fernández, Raquel, and Alessandra Fogli, “Culture: An Empirical Investigation of Beliefs, Work, and Fertility,” NBER Working Paper #W11268, 2007.
Fitzpatrick, Gary L., and Marilyn J. Modlin, Direct-Line Distances (Metuchen, NJ: Scarecrow Press, 1986).
Frankel, Jeffrey, Ernesto Stein, and Shang-jin Wei, “Trading Blocs and the Americas: The Natural, the Unnatural, and the Super-natural,” Journal of Development Economics, 47 (1995), 61–95.
Friedman, Jonathan, “Myth, History, and Political Identity,” Cultural Anthropology, 7 (1992), 194–210.
Giuliano, Paola, “Living Arrangements in Western Europe: Does Cultural Origin Matter?” Journal of the European Economic Association, 5 (2007), 927–952.
Giuliano, Paola, Antonio Spilimbergo, and Giovanni Tonon, “Genetic, Cultural and Geographical Distances,” IZA Discussion Paper No. 2229, 2006.
Glaeser, Edward, David Laibson, José A. Scheinkman, and Christine L. Soutter, “Measuring Trust,” Quarterly Journal of Economics, 115 (2000), 811–846.
Greif, Avner, “Contract Enforceability and Economic Institutions in Early Trade: The Maghribi Traders’ Coalition,” American Economic Review, 83 (1993), 525–548.
Guiso, Luigi, Paola Sapienza, and Luigi Zingales, “People’s Opium? Religion and Economic Attitudes,” Journal of Monetary Economics, 50 (2003), 225–282.
——, “Cultural Biases in Economic Exchange,” NBER Working Paper No. 1105, 2004a.
——, “The Role of Social Capital in Financial Development,” American Economic Review, 94 (2004b), 526–556.
——, “Does Culture Affect Economic Outcomes?” Journal of Economic Perspectives, 20 (2006), 23–48.
——, “Long Term Persistence,” University of Chicago, Working Paper, 2008a.
——, “Social Capital as Good Culture,” Journal of the European Economic Association, 6 (2008b), 295–320.
——, “Trusting the Stock Market,” Journal of Finance, 63 (2008c), 2557–2600.
Hansen, Lars Peter, “Large Sample Properties of Generalized Method of Moments Estimators,” Econometrica, 50 (1982), 1029–1054.
Helpman, Elhanan, and Paul Krugman, Market Structure and Foreign Trade (Cambridge, MA: MIT Press, 1985).
La Porta, Rafael, Florencio López-de-Silanes, and Andrei Shleifer, “What Works in Securities Laws?” Journal of Finance, 61 (2006), 1–32.
La Porta, Rafael, Florencio López-de-Silanes, Andrei Shleifer, and Robert Vishny, “Law and Finance,” Journal of Political Economy, 106 (1998), 1113–1155.
Linders, Gert-Jan M., and Henri L. F. de Groot, “Estimation of the Gravity Equation in the Presence of Zero Flows,” Tinbergen Institute Discussion Paper, 2006.
McEvily, Bill, Roberto A. Weber, Cristina Bicchieri, and Violet Ho, “Can Groups Be Trusted? An Experimental Study of Collective Trust,” in The Handbook of Trust Research, Reinhard Bachmann and Akbar Zaheer, eds. (Northampton, MA: Edward Elgar, 2006).
Menozzi, Paolo, Alberto Piazza, and Luca L. Cavalli-Sforza, “Synthetic Maps of Human Gene Frequencies in Europe,” Science, 201 (1978), 786–792.
Morse, Adair, and Sophie Shive, “Patriotism in Your Portfolio,” University of Michigan, Working Paper, 2006.
Nicita, Alessandro, and Marcelo Olarreaga, “Trade, Production, and Protection Database, 1976–2004,” World Bank Economic Review, 21 (2007), 165–171.
Nunn, Nathan, “Relationship-Specificity, Incomplete Contract, and the Pattern of Trade,” Quarterly Journal of Economics, 122 (2007), 569–600.
Portes, Richard, and Hélène Rey, “The Determinants of Cross Border Equity Flows,” Journal of International Economics, 65 (2005), 269–296.
Rauch, James, “Networks Versus Markets in International Trade,” Journal of International Economics, 41 (1999), 7–35.
Rauch, James, and Vitor Trindade, “Ethnic Chinese Networks in International Trade,” Review of Economics and Statistics, 84 (2002), 116–130.
Reuben, Ernesto, Paola Sapienza, and Luigi Zingales, “Is Mistrust Self-Fulfilling?” University of Chicago, Working Paper, 2008.
Reynolds, John, Bruce S. Weir, and C. Clark Cockerham, “Estimation of the Coancestry Coefficient: Basis for a Short-Term Genetic Distance,” Genetics, 105 (1983), 767–779.
Rose, Andrew K., “One Money, One Market: Estimating the Effect of Common Currencies on Trade,” Economic Policy, 30 (2000), 9–45.
Rose, Andrew K., and Tom D. Stanley, “A Meta-Analysis of the Effect of Common Currencies on International Trade,” Journal of Economic Surveys, 19 (2005), 347–365.
Sapienza, Paola, Anna Toldra, and Luigi Zingales, “Understanding Trust,” NBER Working Paper No. 13387, 2007.
Spolaore, Enrico, and Romain Wacziarg, “The Diffusion of Development,” Quarterly Journal of Economics, 124 (2009), 469–529.
Stock, James H., and Motohiro Yogo, “Testing for Weak Instruments in Linear IV Regression,” NBER Technical Working Paper No. 284, 2002.
Tabellini, Guido, “Culture and Institutions: Economic Development in the Regions of Europe,” IGIER Working Paper No. 292, 2007.
——, “The Scope of Cooperation,” Quarterly Journal of Economics, 123 (2008), 905–950.
Vlachos, Jonas, “Does Regulatory Harmonization Increase Bilateral Asset Holdings?” Stockholm University, Working Paper, 2004.
ESTIMATING THE IMPACT OF THE HAJJ: RELIGION AND TOLERANCE IN ISLAM’S GLOBAL GATHERING∗
DAVID CLINGINGSMITH
ASIM IJAZ KHWAJA
MICHAEL KREMER
We estimate the impact on pilgrims of performing the Hajj pilgrimage to Mecca. Our method compares successful and unsuccessful applicants in a lottery used by Pakistan to allocate Hajj visas. Pilgrim accounts stress that the Hajj leads to a feeling of unity with fellow Muslims, but outsiders have sometimes feared that this could be accompanied by antipathy toward non-Muslims. We find that participation in the Hajj increases observance of global Islamic practices, such as prayer and fasting, while decreasing participation in localized practices and beliefs, such as the use of amulets and dowry. It increases belief in equality and harmony among ethnic groups and Islamic sects and leads to more favorable attitudes toward women, including greater acceptance of female education and employment. Increased unity within the Islamic world is not accompanied by antipathy toward non-Muslims. Instead, Hajjis show increased belief in peace, and in equality and harmony among adherents of different religions. The evidence suggests that these changes are likely due to exposure to and interaction with Hajjis from around the world, rather than to a changed social role of pilgrims upon return.
∗ This paper has benefited from discussions with Alberto Abadie, May AlDabbagh, Tahir Andrabi, Ali Asani, Eli Berman, Amitabh Chandra, Lou Christillo, Jamal Elais, Carl Ernst, Raymond Fisman, Ed Glaeser, Bill Graham, Hamid Hasan, Sohail Hashimi, Zoe Hersov, Alaka Holla, Larry Iannaconne, Guido Imbens, Emir Kamenica, Ijaz A. Khwaja, Charles Kurzman, Erzo Luttmer, Brigitte Madrian, Rachel McCleary, Atif Mian, Rohini Pande, Barbara von Schlegel, Eldar Shafir, Nasim Sherazi, Tarik Yousef, Asad Zaman, and seminar audiences at Harvard University, Georgetown University, the University of Southern California, the London School of Economics, University College London, the Dubai School of Government, Stanford University, the University of California–Berkeley, Tufts University, Case Western Reserve University, University of Pennsylvania, University of Chicago, Columbia University, Ohio State University, the sixth ASREC conference, the NBER National Security Working Group and Political Economy Meetings, and the International Islamic University, Islamabad. Erin Baggott, Katalin Blankenship, Dan Choate, Alexandra Cirone, Benjamin Feigenberg, Martin Kanz, Supreet Kaur, Bilal Malik, Jeanette Park, and most notably Hisham Tariq provided excellent research assistance. We thank the editor, Ed Glaeser, the second editor, and three anonymous referees for comments. We gratefully acknowledge financial support for this project from the Spiritual Capital Research Program of the Metanexus Foundation and the Weatherhead Center for International Affairs at Harvard University. Khwaja thanks the Dubai Initiative at the Kennedy School, the William F. Milton Fund, and the HKS Dean’s Research Fund for financial support. We also thank Pakistan’s Ministry of Religious Affairs for graciously sharing data.
© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009
I. INTRODUCTION
We take advantage of a lottery used by Pakistan to allocate visas for the Hajj pilgrimage to understand the impact of the Hajj
on pilgrims’ attitudes, beliefs, and practices. Our work sheds light on contemporary concerns regarding Islamic orthodoxy and extremism by showing that although the Hajj increases observance of orthodox Islamic practices, it also increases the desire for peace and tolerance toward others, both Muslims and non-Muslims. Our study also contributes to a broader literature on theories of social interaction and social identity and on the role of religious institutions. During five specific days of each year, more than two million Muslim men and women from over one hundred different countries gather in Mecca for the Hajj, often staying for over a month. Pilgrims mix across the lines of ethnicity, nationality, sect, and gender that divide them in everyday life. They affirm a common identity by communally performing identical rituals and dressing in similar garments that emphasize their equality. Numerous pilgrim accounts suggest that the Hajj inspires feelings of unity with the worldwide Muslim community (Wolfe 1997). Malcolm X performed the Hajj after breaking from the heterodox Nation of Islam to become a Sunni Muslim. In a letter from Mecca, he wrote, “There were tens of thousands of pilgrims, from all over the world. . . . We were all participating in the same ritual, displaying a spirit of unity and brotherhood that my experiences in America had led me to believe never could exist between the white and non-white. . . . [W]hat I have seen, and experienced, has forced me to rearrange much of my thought-patterns previously held, and to toss aside some of my previous conclusions” (X 1965, p. 346). Some have worried, though, that by promoting greater unity among Muslims, the Hajj could have negative implications for non-Muslims. After it emerged that some of the July 7 bombers of the London public transport system had undertaken the Hajj, the British intelligence services began monitoring pilgrims (Sunday Times 2007). 
Historically, colonial authorities also expressed similar concerns regarding the Hajj (Bose 2006; Low 2007). Others have expressed concern that the Hajj promotes a particular type of Islam. For example, Naipaul (1981) laments what he sees as the erosion of local religious traditions in South Asian Islam in favor of a more Saudi or Arab version of Islam. Of course, it is difficult to isolate the causal impact of the Hajj based on examples such as those of Malcolm X or the July 7 bombers. Those who choose to undertake the Hajj differ from those who do not, and the choice to do so may reflect other life changes. Thus, changes in pilgrims’ views and behavior after the Hajj may
ESTIMATING THE IMPACT OF THE HAJJ
not reflect its impact. We estimate the effect of performing the Hajj by comparing successful and unsuccessful applicants to a lottery Pakistan uses to allocate its limited supply of Hajj visas. Because our survey included 1,600 Pakistani Sunni Hajj visa applicants and was conducted five to eight months after the completion of the Hajj, our results should be interpreted as isolating the medium-term impact of performing the Hajj on this particular population.
Our results support the idea that the Hajj helps to integrate the Muslim world, leading to a strengthening of global Islamic beliefs, a weakened attachment to local religious customs, and a sense of unity and equality with others who are ordinarily separated in everyday life by sect, ethnicity, nationality, or gender, but who are brought together during the Hajj. Although the Hajj may help forge a common Islamic identity, there is no evidence that this is defined in opposition to non-Muslims. On the contrary, the notions of equality and harmony appear to extend to adherents of other religions as well. These results contrast sharply with the view that increased Islamic orthodoxy goes hand in hand with extremism.
We find that Hajjis (those who have performed the Hajj) are more likely to undertake universally accepted global Muslim religious practices such as fasting and performing obligatory and supererogatory (optional) prayers. In contrast, the Hajj reduces performance of less universally accepted, more localized practices and beliefs such as using amulets and the necessity of giving dowry. For example, the Hajj increases regular prayer in the mosque by 26% and almost doubles the likelihood of nonobligatory fasting. At the same time, it reduces the practice of using amulets by 8% and the South Asian belief according lower marriage priority to widows than to unmarried women by 18%.
The evidence suggests that the Hajj increases tolerance both within the Islamic world and also beyond it.
Hajjis return with more positive views toward people from other countries. Hajjis are also more likely to state that various Pakistani ethnic and Muslim sectarian groups are equal, and that it is possible for such groups to live in harmony. These views of equality and harmony extend to non-Muslims as well: Hajjis are 22% more likely to declare that people of different religions are equal and 11% more likely to state that adherents of different religions can live in harmony. We also find evidence that Hajjis are more peacefully inclined. For example, although few in our sample are willing to condemn the goals of Osama Bin Laden openly, Hajjis are almost twice as
likely to do so. Hajjis are also more likely to express a preference for peace with India and are 17% more likely to declare that it is incorrect to physically punish someone who has dishonored the family. There is little evidence that participating in the Hajj increases support for an increased role of religion in the state or politics, or that it induces negative views of the West. Hajjis are in fact less likely to believe that the state should enforce religious injunctions and that religious leaders should be able to dispense justice. Hajjis and non-Hajjis report similar views regarding the adoption of Western values and on the plausibility of Western/Jewish roles in the September 11 and July 7 terrorist attacks. The feelings of unity and equality brought about by the Hajj extend across gender lines to an extent. Hajjis report more positive views on women’s attributes and abilities. For example, they are 6% more likely to think women are spiritually better than men, an increase of over 50%. They also express greater concern about women’s quality of life in Pakistan relative to other countries and about crimes against women in Pakistan. Hajjis are also more likely to support girls’ education and female participation in the professional workforce. Hajjis show an 8% increase in their declared preference for their daughters or granddaughters to adopt professional careers. Male Hajjis show changes in views similar to those of female Hajjis. However, not all views on gender change. In particular, Hajjis are no more likely to question Islamic doctrine, such as unequal inheritance laws across gender, or to express views that potentially challenge male authority within the household, such as the correctness of a woman divorcing her husband. This suggests that Pakistani Hajjis’ altered views on women reflect a movement away from local prejudices against women and toward fairer treatment within Islam, rather than a more general trend toward feminism. 
Hajjis, primarily women, report lower levels of emotional and physical well-being. This may be due to the physically taxing nature of the Hajj rituals, as well as changed beliefs and greater awareness of the Muslim world outside Pakistan, particularly for women. Contrary to some of the historical literature on the Hajj (cf. Azarya [1978]; Donnan [1989]; Yamba [1995]), we do not find evidence in our sample of major changes, at least in the medium term, in the social role or engagement of Hajjis after their return.
Although our study cannot definitively determine what drives the impact of the Hajj, further evidence suggests that the results are not driven by changes in pilgrims’ social roles upon return but rather reflect changes the pilgrim experiences during the Hajj, particularly exposure to Muslims from around the world. Hajjis gain experiential knowledge of the diversity of Islamic practices and beliefs, gender roles within Islam, and, more broadly, the world beyond Pakistan. Although the Hajj effects could be driven by a change in religious commitment, we do not find that Hajjis acquire greater formal religious knowledge. The Hajj’s impact on experiential knowledge and on some tolerant attitudes toward other groups tends to be larger for those traveling in smaller groups, who are more likely to have a broad range of social interactions with people from different backgrounds during the Hajj. Hajjis also show the largest positive gain in their views of other nationalities in relation to Indonesians, the non-Saudi group they are most likely to observe during the Hajj. Hajjis’ changed views toward women may also reflect an exposure channel, because the Hajj offers Pakistani pilgrims a novel opportunity to interact with members of the opposite gender in a religious setting, and to observe cross-gender interactions among Muslims from nations more accepting of such interactions. Our results shed light on contemporary concerns about Islamic orthodoxy and extremism. For many in the West the link is apparent: 45% of Americans believe Islam is more likely to encourage violence than other religions and close to one-third use negative words such as fanatic, radical, and terror to describe their impressions of Islam (PEW Forum 2007). It is noteworthy that although the Hajj leads to greater religious orthodoxy, it also increases pilgrims’ desire for peace and tolerance toward others, both Muslims and non-Muslims. 
Our results also connect to a broad, longstanding literature on social interaction and the shaping of beliefs and identity. Laboratory experiments suggest that group interactions exacerbate conflict in competitive settings and lessen it in cooperative ones (DeVries and Slavin 1978; Stephan 1978; Johnson and Johnson 1983; Aronson and Patnoe 1997; Slavin and Cooper 1999; Pettigrew and Tropp 2006). Drawing on a particular real-world setting, Boisjoly et al. (2006) report evidence that exposure to African-American roommates generates more positive attitudes toward African-Americans among white students. In contrast, Fisman et al. (2008) find that exposure to a different race in youth makes
individuals less likely to prefer that race for a potential mate as adults. Although social identity theory suggests that strengthening attachment to an in-group may lead to negative feelings toward an out-group (Sherif et al. 1954; Tajfel 1970; Tajfel and Turner 1986), our evidence shows that Hajjis also positively update their views both toward groups to which they were exposed and toward those to which they were not.
Our findings also relate to a question in the sociology and economic modeling of religion about why religions often incorporate individually costly practices, and more broadly about the impact of religion on development (Iannacone 1992; Glaeser and Glendon 1998; Berman 2000; Sacerdote and Glaeser 2001; Barro and McCleary 2003; Guiso, Sapienza, and Zingales 2003). Putnam (2007) suggests that, in the U.S. context, religion may play a particularly important role as a “glue” that builds social capital. Our results suggest that the Hajj may play a role in contributing to the survival of Islam as a unified world religion. Over time, religions with far-flung adherents tend to evolve separate strands. Absent a central hierarchy in Islam, the Hajj may help bind the Islamic world together by moving Hajjis toward a common set of practices, making them more tolerant of others, and by creating a stronger shared identity.
Further work would be needed to determine the extent to which our findings generalize beyond the specific context we examine. Our survey, conducted five to eight months after the Hajj, captures medium-term effects. Although it remains open whether these effects persist, we find few changes over the survey period. Further examination is also needed to determine the extent to which the results generalize beyond our sample of Pakistani Hajj lottery applicants. For example, the impact of the Hajj on gender attitudes may be smaller for pilgrims from countries with more liberal gender views.
To the extent that the gender results reflect a convergence to the mean views of Hajjis, the Hajj may even induce more conservative views for pilgrims from these countries. Moreover, nearly half of lottery applicants are illiterate. Although this is fairly representative of Pakistan and a large number of Muslim countries, results could differ in more educated societies. Finally, we assess the impact of the Hajj using survey questions, several of which elicit self-reported beliefs and opinions. Additional work is needed to determine how the Hajj would impact pilgrims’ actions. The remainder of this paper is organized as follows. Section II gives some background on the Hajj, focusing on aspects that
contextualize our findings. Section III lays out our statistical approach, outlines aspects of the visa application process that are important for our identification strategy, and gives details of the survey. Section IV presents the main empirical results on religious practice and belief, tolerance, gender, and well-being. Section V explores potential channels for the observed effects. Section VI concludes with some underpinnings and broader implications of our results.
II. THE HAJJ EXPERIENCE
The pilgrimage to Mecca is one of the five pillars of Islam and is obligatory for those with sufficient financial means. Many Hajjis describe it as the most significant religious event in their lives. Although the Hajj rituals last five days, many pilgrims stay longer. Most of the Hajjis in our sample report spending 40 days worshipping in the cities of Mecca and Medina.1
The Hajj is an inherently communal experience, in a religion that gives particular importance to communal rituals (McCleary 2007). Each ritual component of the Hajj is performed simultaneously with over two million participants. The focus is on individual practice rather than building religious knowledge. Moreover, each participant’s performance is believed to reinforce the others’, providing a shared aspect to individual worship.
The Hajj engenders substantial mixing across national, sectarian, and gender lines in an atmosphere that emphasizes equality and unity. Pilgrims’ common identity is affirmed through common dress, a simple white garment known as the ihram, and the communal performance of standardized ritual practices.2 Men shave their heads, making them more similar in physical appearance, and those who complete the Hajj are entitled to use the honorific Hajji/Hajjin as a prefix to their name, further emphasizing their common identity.
Although opportunities for in-depth intergroup interactions are limited both by language barriers and by the housing of pilgrims with their compatriots, the group nature of the experience makes observations of the contrasting practices and social dynamics of other groups even more salient. Close to two-thirds of the
1. See the Appendix for a timeline of the Hajj and an outline of the rituals and activities.
2. Men wear two white sheets. Women face less stringent requirements but typically also wear white.
Hajjis in our survey reported interacting with people from other countries frequently during the Hajj.
The Hajj also involves more gender mixing than is typical among the Pakistani pilgrims we study. In Pakistan, interaction between men and women who are strangers is uncommon. Women rarely go to the mosque and when they do, they typically pray in a separate area from men. With equal numbers of male and female Hajjis (Bianchi 2004), such gender interactions are a natural part of the Hajj. Parties of pilgrims stay and move together for ease of planning and safety, and often include non-family members. Men pray alongside women, both Pakistani and non-Pakistani, during the Hajj. Our qualitative interviews revealed that these experiences were both very salient and unusual for Hajjis, and that most viewed them positively.
Finally, Hajjis are also exposed to a degree of religious diversity within the recognized schools of thought, in a religiously sanctioned context in which all are accepted. The fourteenth-century Muslim explorer Ibn Battuta noted that because followers of different schools of Islam prayed together at Mecca, this often led to mixing of religious practices (Ibn Battuta 2002 [1355]).
The Hajj is physically and financially taxing. Pilgrims travel over 80 km, much of it typically on foot. The extreme congestion heightens risks of injury and infectious disease (Ahmed, Arabi, and Memish 2006). Hajjis typically also sacrifice rest in order to maximize prayers during their stay. Participants in Pakistan’s Hajj lottery system pay about US$2,000 each for the trip, roughly two and one-half times Pakistan’s 2006 per capita GDP. The median respondent in our survey saved for the Hajj for over four years.
III. METHODOLOGY: THE HAJJ LOTTERY AND SURVEY
Because those who choose to perform the Hajj are likely to be motivated by a wide spectrum of unobservable and potentially time-varying factors, such as religious commitment and the desire for spiritual transformation, it is difficult to measure the impact of the Hajj by comparing Hajjis and non-Hajjis. We address this by taking advantage of a lottery that allocates Hajj visas. Because successful and unsuccessful lottery applicants are ex ante identical in expectation, we can use the lottery outcome as an instrument for whether someone performs the Hajj and isolate the Hajj impact from potential confounding factors. This section provides details
of the Hajj lottery, our statistical methodology, and our survey process.3
III.A. The Hajj Lottery Process
Historically, overcrowding on the Hajj has created logistical and safety problems. Saudi Arabia, on whose territory the Hajj takes place, therefore has established quotas for the number of Hajj visas available for each major Islamic country. For the January 2006 Hajj that we study, Pakistan’s total quota was 150,000 visas. Ninety thousand visas were allocated by the government, the majority (89%) by randomized lottery and the remainder by special quotas for the military and civil service (Organization of the Islamic Conference 2007). Applicants submitted a short form and deposited the Hajj fee at one of 1,559 bank branches across Pakistan between July 20 and August 15, 2005. The remaining 60,000 visas were allocated by private tour operators. As most Pakistanis are not eligible for the special quotas and the private operators are typically more expensive, the lottery is the primary source of visas for most Pakistanis. A total of 134,948 people were part of the government lottery with 59% successful.4
The Hajj lottery is conducted over parties of up to 20 individuals who will travel and stay together during the pilgrimage.5 Parties are formed either voluntarily, often along family lines, or by staff of the bank branches. Parties are assigned into separate strata for the two main Islamic sects (Sunni/Shia), eight regional cities of departure, and two types of accommodation that vary slightly in housing quality. A computer algorithm selects parties randomly from each stratum until the quota of individuals for that stratum is full. This process leads to a slightly lower chance of success for larger parties; if the selected party is larger than the remaining quota, it is set aside and another is randomly chosen from the remaining pool.
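The within-stratum draw described above can be illustrated with a short sketch. This is not the algorithm actually used by the lottery's implementer, whose code is not described in the source; it is a minimal reading of the verbal description, and the names `draw_stratum`, `party_sizes`, and `quota` are hypothetical. It assumes that a party set aside for being too large is not reconsidered later, which is harmless because the remaining quota only shrinks.

```python
import random


def draw_stratum(party_sizes, quota, rng):
    """Select winning parties within one stratum (illustrative sketch only).

    party_sizes: dict mapping a party id to its number of members.
    quota: number of individual visas available in this stratum.
    rng: a random.Random instance, passed in so draws are reproducible.
    """
    # Visiting parties in a uniformly random order is equivalent to
    # repeatedly drawing one party at random from the remaining pool.
    pool = list(party_sizes)
    rng.shuffle(pool)

    remaining = quota
    winners = []
    for party in pool:
        if remaining == 0:
            break  # quota of individuals is full
        size = party_sizes[party]
        if size <= remaining:
            winners.append(party)
            remaining -= size
        # Otherwise the party is larger than the remaining quota:
        # it is set aside and the next random party is considered.
    return winners


# Hypothetical stratum with four parties and a quota of seven visas.
example_sizes = {"A": 5, "B": 3, "C": 4, "D": 2}
example_winners = draw_stratum(example_sizes, 7, random.Random(0))
```

The sketch also shows why larger parties face a slightly lower chance of success: once the quota is nearly full, only small parties can still fit.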
The lottery selection algorithm was designed and implemented by an independent and reputable third party, and there were no reports of lottery manipulation. The rich and connected typically go through a private Hajj tour operator or the special
3. See Clingingsmith, Khwaja, and Kremer (2008) for a fuller description.
4. Excluding applicants automatically given visas because they applied unsuccessfully in the two preceding years.
5. In our survey sample, 34% of applicants were in a party with fewer than five people (only 1% were in a party of one), and 70% were in a party with ten or fewer people. The remaining 30% were in groups of size eleven to twenty.
quota rather than participating in the Hajj lottery. Consistent with the hypothesis of random assignment, success in the Hajj lottery is individually and jointly uncorrelated with applicant characteristics listed on the Hajj application forms, such as gender, marital status, year of birth, education, branch of application, and whether applicants listed a telephone number. A joint F-test fails to reject the null hypothesis of random assignment with a p-value of .98 (Table I, Panel A). With observations on 134,948 Hajj applicants, if the lottery were subject to influence, one would expect significant differences by characteristics such as education level, so this offers a pretty strong test.
Among the successful Hajj applicants we surveyed, 99% went on the Hajj. Some unsuccessful lottery applicants secure a place with a private Hajj operator or through the special quota. Thus, 11% of those who were unsuccessful in the government lottery still performed the Hajj that year. Because compliance with the lottery is not perfect, we use success in the lottery as an instrumental variable to estimate the effect of performing the Hajj. This yields the local average treatment effect (LATE) for those for whom the outcome of the lottery determines Hajj participation. Our estimation equation is
$$Y_{ik} = \alpha_k + \beta_k \mathrm{Hajj}_i + \lambda_c + \varepsilon_{ik}, \tag{1}$$
where $\mathrm{Hajj}_i$ is an indicator variable for whether individual $i$ performed the Hajj, $Y_{ik}$ is the $k$th outcome of interest, and $\mathrm{Hajj}_i$ is instrumented by the individual’s lottery status. As long as success in the Hajj lottery only affects outcomes by inducing applicants to undertake the Hajj, this provides unbiased estimates of $\beta_k$. Although we have no explicit way of ruling out a direct lottery effect, it seems unlikely, because our survey period was 8 to 11 months after the lottery. Potential direct effects, such as disappointment at not receiving a visa, are likely to be short-lived, especially given that individuals reapply. Moreover, such an effect would likely have led unsuccessful individuals to report greater distress, whereas we will show evidence to the contrary.
Equation (1) also includes stratum-by-party size cell fixed effects $\lambda_c$, as the randomization was done within strata and there were slightly different chances of success depending on party size. However, because quotas for each departure city were proportional to the number of applications and the chance of success varied only slightly with party size, results are similar without
TABLE I
RANDOMIZATION CHECKS

                                  Panel A             Panel B
                                  Success in          Success in lottery   Success in lottery
                                  lottery             among interviewed    among interviewed,
                                                                           restricted subsample
Applicant characteristic          Coefficient (SE)    Coefficient (SE)     Coefficient (SE)
Female                            −0.001 (0.004)      −0.017 (0.022)       −0.026 (0.024)
Application number^a              0.001 (0.003)       −0.003 (0.015)       −0.015 (0.016)
Travel party number^a             0.005 (0.006)       0.071 (0.068)        0.037 (0.072)
Year of birth                     0.000 (0.000)       0.001 (0.001)        0.001 (0.001)
Married                           0.009 (0.008)       −0.017 (0.063)       0.006 (0.073)
Middle school                     −0.001 (0.005)      −0.019 (0.037)       −0.012 (0.041)
High school                       0.000 (0.006)       −0.045 (0.046)       −0.050 (0.051)
Intercollege and up               0.002 (0.008)       −0.005 (0.052)       −0.006 (0.060)
Branch of application^a           0.005 (0.009)       −0.004 (0.042)       0.000 (0.000)
Provided phone number             −0.001 (0.011)      0.080 (0.060)        0.094 (0.064)
Constant                          1.142 (0.264)       −1.464 (2.499)       −2.481 (2.689)
Observations                      —                   1,605                1,295
R²                                .02                 .06                  .10
Joint F-test of individual
  characteristics (p-value)       .98                 .89                  .81

Notes. Robust standard errors in parentheses clustered at the party level. Regressions include dummies for place of departure × accommodation category × party size category.
a. Application number is in units of 100,000; travel party number is in units of 10,000; branch code is in units of 1,000.
∗ Significant at 10%. ∗∗ Significant at 5%. ∗∗∗ Significant at 1%.
the cell dummies. Standard errors are clustered at the party level, because outcomes for people traveling together may not be independent. Both for ease of exposition and to lessen data-mining concerns, we present our results using thematic indices that are constructed by grouping related questions. For example, for views on
female education, we construct an index that combines questions about whether girls should receive education, what level of schooling girls should receive, etc. Although results on any component question could potentially be due to chance (Type I error), this is less likely when one simultaneously considers several related questions in an index. Moreover, the use of indices reduces the risk of low statistical power (Type II error). We compute the average effect size (AES) across outcomes (components) within an index following O’Brien (1984) and Kling et al. (2004).6 For a family of $J$ related outcomes $Y_j$ in an index, with Hajj local average treatment effects $\pi_j$, the average effect size is
$$\tau = \frac{1}{J} \sum_{j=1}^{J} \frac{\pi_j}{\sigma_j},$$
where $\sigma_j$ is the standard deviation of outcome $j$ in the comparison group.7
We report standard tests of the null hypothesis of no effect for each individual index. However, because we have 25 indices, we also show that our results are robust to multiple hypothesis testing by using a conservative Bonferroni–Holm test, which makes no assumptions about the correlation of hypotheses. We can reject the null hypothesis of one or more false positives with an α of 0.07. Less conservative methods, such as specifying an acceptable false discovery rate, would make this result even stronger.
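The pieces of the estimation strategy above can be tied together in a small sketch: a Wald (instrumental-variable) estimate of each treatment effect, the average effect size across the components of an index, and a Holm step-down check across indices. It is an illustration under simplifying assumptions, not the authors' procedure: it omits the fixed effects $\lambda_c$ of equation (1) and party-level clustering, it takes the "comparison group" to be unsuccessful applicants, and it does not reproduce the seemingly unrelated regression used for inference. All function and variable names are hypothetical.

```python
from statistics import mean, stdev


def wald_late(y, z, d):
    """Wald IV estimate with a binary instrument z and binary treatment d.

    y: outcomes; z: 1 if successful in the lottery; d: 1 if performed the Hajj.
    """
    y_win = [yi for yi, zi in zip(y, z) if zi == 1]
    y_lose = [yi for yi, zi in zip(y, z) if zi == 0]
    d_win = [di for di, zi in zip(d, z) if zi == 1]
    d_lose = [di for di, zi in zip(d, z) if zi == 0]
    reduced_form = mean(y_win) - mean(y_lose)  # effect of winning on the outcome
    first_stage = mean(d_win) - mean(d_lose)   # effect of winning on going
    return reduced_form / first_stage


def average_effect_size(outcomes, z, d):
    """tau = (1/J) * sum over j of pi_j / sigma_j.

    sigma_j is the standard deviation of outcome j among lottery losers
    (an assumption about what the comparison group is).
    """
    ratios = []
    for y in outcomes:
        pi_j = wald_late(y, z, d)
        sigma_j = stdev([yi for yi, zi in zip(y, z) if zi == 0])
        ratios.append(pi_j / sigma_j)
    return mean(ratios)


def holm_reject(p_values, alpha):
    """Holm step-down procedure; returns rejection flags in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The smallest p-value is compared with alpha/m, the next with
        # alpha/(m-1), and so on; stop at the first failure.
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break
    return reject
```

With the compliance rates reported in the text (99% of successful and 11% of unsuccessful applicants performed the Hajj), the first stage is about 0.88, so the Wald estimate scales the simple winner-versus-loser difference up by roughly 1/0.88.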
III.B. The Survey
We surveyed successful and unsuccessful applicants to the 2006 Hajj lottery five to eight months after the Hajj.8 The survey includes questions on religious knowledge and practice, tolerance, views on gender, social interaction and roles, political involvement and beliefs, physical and mental health, and business and employment, as well as background information on the household and its members.
6. Results are similar with indices that average over component questions (see the working paper version).
7. To test for $\tau$ against the null hypothesis of no average effect, we account for the covariance between the effects $\pi_j$ by jointly estimating the $\pi_j$ in a seemingly unrelated regression framework. We stack the $J$ outcomes and use our treatment effects regression fully interacted with dummy variables for each outcome as the right-hand side. The coefficients $\pi_j$ are the same as those estimated in the outcome-by-outcome regressions. Our stacked regression now gives us the correct covariance matrix to form a test of $\tau$.
8. Conducting a baseline survey was infeasible because the lottery took place less than a month after applicant data were available. Surveys after the lottery would not constitute a valid baseline because the successful applicants were preparing to leave and differentially affected.
TABLE II
SUMMARY STATISTICS

                               Adult Pakistani population
                               (restricted > 20 years old)   Full sample           Restricted subsample
Characteristic                 Mean      Std. dev.           Mean      Std. dev.   Mean      Std. dev.
Age                            40.16     16.244              54.575    13.240      55.039    13.246
Female                         0.499     0.500               0.490     0.500       0.496     0.500
Married                        0.703     0.497               0.943     0.232       0.948     0.222
Illiterate                     0.482     0.458               0.402     0.490       0.417     0.493
Intercollege and up            0.201     0.43                0.178     0.383       0.157     0.364
City^a                                                       0.400     0.490       0.372     0.483
Periurban/large village^a                                    0.274     0.460       0.293     0.455
Rural^a                                                      0.325     0.470       0.335     0.472
Ballot success                                               0.533     0.499       0.524     0.500
Monthly expenditures (log)     8.678     0.641               8.832     0.783       8.896     0.726

Notes. N = 1,605 for full sample, N = 1,295 for subsample, and N = 29,995 for adult Pakistani population. The Pakistani adult population is from the MICS 2003–4 survey (restricted to the same districts as in our sample).
a. City, periurban, and rural classifications comparable to our survey data are not available in the MICS.
The initial sampling frame was the list of all Hajj lottery applicants obtained from the Ministry. The survey area was limited for logistic ease to nine administrative districts in the Punjab province.9 Surveyors used addresses and telephone numbers provided in the applications to locate applicants and interview them at their residences. The sample was also restricted to Sunni applicants, because there were too few Shia applicants for meaningful inferences to be drawn. To maximize statistical power, we randomly selected equal numbers of winning and losing parties. Within each party, we randomly selected an individual to interview, and, if other party members of opposite gender were identified as living with the individual, we also selected a second person of the opposite gender.
Surveyed applicants are broadly representative of the adult Pakistan population (Table II) with some truncation of the extremes of the socioeconomic distribution, because the poorest cannot afford to go on the Hajj and the rich typically travel on private schemes. Hajj applicants have average education and household
9. The districts were Attock, Islamabad, Rawalpindi, Jhelum, Chakwal, Faisalabad, Sargodha, Multan, and Gujrat.
TABLE III
SURVEY COMPLETION STATISTICS

                            Panel A: Full sample                  Panel B: Restricted subsample
                                    Lottery status                        Lottery status
Characteristic              Total   Successful   Unsuccessful     Total   Successful   Unsuccessful
Selected for interview      2,537   1,286        1,251            1,995   1,032        963
Raw completed interviews    1,605   855          750              1,295   679          616
Completion rate (%)         63.3    66.5         60.0             64.9    65.8         64.0
Not completed (%)
  Dead/ill                  2.1     2.1          2.1              2.3     2.2          2.3
  Lives elsewhere           10.4    10.0         10.8             9.8     9.8          9.8
  Not found                 8.3     6.4          10.3             7.7     6.6          8.9
  Not home                  7.9     8.7          7.2              8.2     8.7          7.6
  Refused                   7.9     6.3          9.6              7.2     6.9          7.5

Notes. Interview completion percentages from surveyor reports.
expenditures similar to those for the general population, but are older and more likely to be married. Forty percent are from cities, fairly similar to the general population. Surveyors completed interviews with 1,605 applicants, 63% of the 2,537 they attempted to interview (Table III, Panel A). However, only 7.9% of the attempted interviews were refused. In about three-quarters of unsuccessful attempts, surveyors were unable to contact or locate applicants. Some applicants lived in a different (out-of-sample) district from the one provided in their application address (often a relative’s address they wished to travel with), and it was not logistically possible to survey them. In other cases, addresses were incomplete or incorrect or the applicant was not at home despite three separate attempts. Among applicants the survey team could contact (i.e., interviewed plus refusals), the survey completion rate was therefore 88.8%. Successful applicants completed the survey at a 66.5% rate, higher than the 60.0% rate for unsuccessful applicants. This difference is statistically significant at the 1% level (Table III, Panel A). Hajjis were easier to locate, perhaps because their participation in the Hajj made them better known in their localities. Successful applicants also had a slightly lower refusal rate, possibly because they regarded the survey as being more pertinent for those who had actually performed the Hajj.
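The comparison of completion rates can be illustrated with a simple two-proportion z-test on the counts in Table III, Panel A (855 of 1,286 successful applicants and 750 of 1,251 unsuccessful applicants completed the survey). This is only a sketch: the source does not say which test the authors used, and this version treats interviews as independent, ignoring the party-level sampling. The function name `two_proportion_z` is hypothetical.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Completed interviews / selected for interview, Table III, Panel A.
z_stat, p_val = two_proportion_z(855, 1286, 750, 1251)
```

Under these simplifying assumptions the z-statistic exceeds 3, which is in line with the reported significance at the 1% level.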
The unbalanced interview completion between successful and unsuccessful lottery applicants could potentially introduce selection and bias our estimate of the Hajj effect.10 Therefore, we provide three robustness checks against selection concerns. First, Table I, Panel B, shows that for completed interviews, lottery success is not individually or jointly correlated with observable applicant characteristics. Second, our results are robust to demographic controls. None of our 25 index results qualitatively change with controls for district, urban or periurban location, and individual characteristics.11 Finally, we examine the robustness of our results to a restricted subsample (Table II) that excludes nine out of the 49 tehsils (subdistricts) in our survey area that were particularly difficult to survey. This subsample is balanced on survey completion and reasons for noncompletion. It excludes tehsils with more than 25 selected applicants (tehsils with smaller samples may generate imbalance mechanically) where the completion rate for successful applicants exceeded that for unsuccessful ones by more than 7%. This subsample contains 81% of the total interviews. Although the completion rate was still somewhat higher for successful applicants (65.8% vs. 64.0%), we fail to reject the null hypothesis of an identical completion rate with a p-value of .66. As in the full sample, lottery success in the interviewed subsample is uncorrelated with applicant characteristics (Table I, Panel B). As our results below show, there is no qualitative change in our estimates in the subsample.
IV. MAIN RESULTS
This section presents our main results on the impact of the Hajj. Sections IV.A–IV.D examine religious behavior and practices, tolerance, gender attitudes, and well-being, respectively. Our
10. A selection effect would imply that the marginal surveyed successful applicant was less willing to give an interview (more uncooperative) and harder to locate. This is because the initial randomization guarantees that successful and unsuccessful applicants are distributed identically along any attribute. If selection is introduced by, for example, successful applicants gaining incremental visibility from traveling, then the marginal successful applicant found is slightly less well known ex ante than the marginal unsuccessful applicant found. However, it is not clear how such potential selection could generate several of our results, such as a shift from localized to global practice or increased tolerance. If anything, one may expect the opposite for the tolerance result, because selection implies that the interviewed successful applicant is marginally less cooperative.
11. We can use additional data such as assets and expenditure from survey data, and the results are robust to these as well. We prefer not to present these as primary controls due to their potential endogeneity.
1148
QUARTERLY JOURNAL OF ECONOMICS TABLE IV RELIGION AES coefficients
(1) Regarded as religious (2) Global Islamic practice (3) Belief in localized Muslim practices (4) Participation in localized Muslim practices
Base
Controls
Restricted subsample
0.238∗∗∗ (0.06) 0.163∗∗∗ (0.030) −0.101∗∗∗ (0.032) −0.097∗∗ (0.046)
0.230∗∗∗ (0.055) 0.166∗∗∗ (0.029) −0.094∗∗∗ (0.031) −0.097∗∗ (0.045)
0.258∗∗∗ (0.061) 0.171∗∗∗ (0.033) −0.074∗∗ (0.035) −0.085∗ (0.052)
Notes. Columns give AES estimates for our base, control, and restricted subsample specifications. The AES averages the normalized treatment effects obtained from a seemingly unrelated regression in which each dependent variable is a question in the index. All regressions include dummies for place of departure × accommodation category × party size category, as well as dummies for each of the nine districts in the survey. All results come from IV regressions where the instrument is success in the Hajj lottery. Standard errors in parentheses clustered at the party level: Index component questions with number of components indicated in parentheses: Index 1 (1): Do others regard you as religious? Index 2 (10): How frequently do you: pray, do tasbih after prayer, pray in the mosque? Did you pray in the mosque last Sunday? Do you pray optional night prayers? Can you read the Qu’ran? How frequently do you: read the Qu’ran? discuss religious matters? keep fast during Ramadan? keep fast outside Ramadan? Index 3 (10): What is your general view of holy men? Do you regard: visiting holy men as correct? visiting shrines? using amulets? doing a forty-day death ceremony? participating in maulad mehfil (special religious gathering)? Do you believe that: a cap is required for prayer? that dowry is mandatory? that widows have different priority in remarriage? that there can be intercession on Judgment Day? Index 4 (4): Do you actively visit holy men? visit shrines? use amulets? participate in maulad mehfil? ∗ significant at 10%; ∗∗ significant at 5%; ∗∗∗ significant at 1%.
power to detect interaction effects is limited, so we generally do not present interaction results, except in cases where we have strong priors and reasonable power and consistency, as in the case of gender. Our estimates capture the effect of the Hajj five to eight months after pilgrims return. Although this limits our ability to explore persistence of the effects, we do not find any significant changes over the survey period. The rows of Tables IV–VIII present the average effect size (AES) estimates for each index, including the control and restricted subsample specifications. Because the results are very similar, we focus on the base specification. The component questions in each index are described in the notes to the tables. Table IX further presents results for several index component questions of individual interest. A supplemental Online Appendix presents the Hajj impact estimates and definition details for all the component questions.
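The AES point estimate described in the notes to Table IV can be sketched as follows. This is a minimal illustration, not the authors' estimation code: it assumes that "normalized" means dividing each question's estimated effect by that question's standard deviation in the comparison group, it takes the per-question effects as given rather than estimating them by IV, and it omits the seemingly unrelated regression step that is needed for the standard errors. The function name and all numbers are hypothetical.

```python
from statistics import mean


def average_effect_size(effects, control_sds):
    """Average the per-question effects after scaling each one by the
    comparison-group standard deviation of its question (assumed normalization)."""
    if not effects or len(effects) != len(control_sds):
        raise ValueError("need at least one effect and one control SD per effect")
    if any(sd <= 0 for sd in control_sds):
        raise ValueError("control standard deviations must be positive")
    return mean([effect / sd for effect, sd in zip(effects, control_sds)])


# Hypothetical three-question index (illustrative numbers, not from the paper).
effects = [0.10, 0.04, 0.06]       # estimated effect on each question
control_sds = [0.50, 0.20, 0.40]   # SD of each question in the comparison group
print(round(average_effect_size(effects, control_sds), 3))
```

Scaling before averaging puts questions with different response scales on a common footing, which is why the index results are reported in standard deviation units.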
TABLE V
TOLERANCE
AES coefficients

                                    Base        Controls    Restricted subsample
(1) Views of other countries        0.150∗∗∗    0.147∗∗∗    0.151∗∗∗
                                    (0.04)      (0.04)      (0.04)
(2) Views of other groups           0.131∗∗∗    0.108∗∗     0.122∗∗
                                    (0.05)      (0.05)      (0.06)
(3) Harmony                         0.128∗∗∗    0.117∗∗∗    0.126∗∗∗
                                    (0.04)      (0.04)      (0.05)
(4) Peaceful inclination            0.111∗∗∗    0.121∗∗∗    0.128∗∗∗
                                    (0.03)      (0.03)      (0.04)
(5) Political Islam index           −0.050      −0.044      −0.043
                                    (0.04)      (0.03)      (0.04)
(6) Views of West                   0.029       0.039       0.011
                                    (0.04)      (0.04)      (0.04)

Notes. See notes to Table IV. Index component questions with number of components indicated in parentheses: Index 1 (6): General view of people from other countries, positive to negative: Saudis, Indonesians, Turks, African, Europeans, Chinese. Index 2 (3): How do members of the following groups compare to your group: different sect? different religion? different ethnicity? Index 3 (4): Do you believe the following groups can live in unity and harmony through compromise over disagreements: sects of Islam? religions? Pakistani ethnic groups? Do you ever pray in a mosque of a different school of thought? Index 4 (8): Belief in incorrectness of: Osama’s goals? Osama’s methods? How important is peace with India for Pakistan? Should the current India/Pakistan boundary be the permanent border if this leads to peace? Should Pakistan not support/only partly support those fighting the Indian government in Kashmir? How incorrect are: suicide attacks? attacks on civilians in war? physical punishment of someone who dishonors family? Index 5 (5): Agree that: government should enforce Islamic injunctions? religious leaders have right to dispense justice? religious leaders should have direct influence on government? better for politicians/officials to have strong religious beliefs? religious beliefs important in voting for candidate? Index 6 (4): Is it bad for Pakistanis to adopt: Western social values? Western technology? Believe there was Western/Jewish role in 9/11 and 2005 London bombing? Believe West does not take into account interests of countries such as Pakistan?
TABLE VI
GENDER
AES coefficients

                                          Base        Controls    Restricted subsample
(1) Views toward women                    0.120∗∗∗    0.116∗∗∗    0.139∗∗∗
                                          (0.04)      (0.04)      (0.04)
(2) Women’s quality of life               0.158∗∗∗    0.138∗∗∗    0.166∗∗∗
                                          (0.05)      (0.05)      (0.06)
(3) Girls’ education                      0.092∗∗     0.089∗∗     0.097∗∗
                                          (0.04)      (0.04)      (0.04)
(4) Women in workforce/professions        0.119∗∗∗    0.112∗∗∗    0.091∗∗
                                          (0.04)      (0.04)      (0.04)
(5) Gender authority                      −0.005      −0.010      0.005
                                          (0.02)      (0.02)      (0.03)

Notes. See notes to Table IV. Index component questions with number of components indicated in parentheses: Index 1 (4): How do men and women compare: mentally/intellectually? spiritually? morally/ethically? Are men and women equal? Index 2 (5): Opinion of quality of women’s lives in following countries/regions relative to Pakistan: Saudi Arabia, Indonesia/Malaysia, West. Think too many crimes against women in Pakistan: overall? relative to men? Index 3 (5): Should girls attend school? Until what level would permit attendance at coeducational schools for: girls? boys? Until what level should coeducational schools be allowed? How many years should girls study relative to boys? Index 4 (3): Like daughters/granddaughter to work? Like a professional occupation for daughters/granddaughters? Good employment important for daughter/granddaughter-in-law? Index 5 (7): Women better at managing daily affairs? Wives have equal say in deciding number of children? Is it sometimes correct for: woman to divorce husband? marry against parents’ wishes? When jobs scarce men should not have more right to one than women? Should daughter have equal inheritance share? Do women count equally to men as witnesses?

IV.A. Religious Practices and Beliefs

Hajjis are 13% more likely to report they are regarded as religious persons, a one-fourth standard deviation increase relative to the control group (Table IV, row (1)). Three indices explore how the Hajj affects religious practice and belief. The first measures global Islamic religious practice, meaning the performance of rites universally acknowledged within the Muslim world. Questions, described in the notes to Table IV, include the applicant’s observance of prayer, fasting, and Qur’anic recitation. The Hajj increases the global religious practice index by 0.16 standard deviations (row (2)). This is a fairly large effect, particularly because it reflects practice five to eight months post-Hajj, and not the fervor of a recently returned pilgrim. The Hajj nearly doubled the rate of regular fasting outside of Ramadan (the obligatory month of fasting) to around 9% and increased praying Tahajjud (supererogatory) prayers by two-thirds (Table IX, rows (2) and (3)).

In most Muslim countries, there are a variety of Islamic traditions that are not as universally accepted as the global practices examined above. Some of these are specific to particular countries or regions. The Hajj rituals highlight global practices. Local practices might decline because they compete for time and attention with global practices, or because the Hajj induces a shift in belief. We find evidence of an absolute shift away from local beliefs and practices. Although most pilgrims initially have moderately high levels of local beliefs, the Hajj leads to a 0.10–standard deviation reduction in an index of localized beliefs that are fairly common in South Asia but not among Muslims globally (Table IV, row (3)). Some practices, such as visiting the tombs of saints and using amulets, have roots in local Sufi traditions. Others reflect
local interpretation of Islamic doctrine, such as giving dowry (Islam instead emphasizes mehr, where a man commits to pay his wife in case of divorce) and what remarriage priority should be accorded to widows. Whereas South Asian women often lose status when their husbands die and have little prospect of remarriage, in Islam a widow can readily remarry after a short waiting period. The Hajj similarly reduces an index of localized religious practice, related mainly to the Sufi traditions mentioned above, by 0.10 standard deviations (Table IV, row (4)). As we noted earlier, some have expressed concern about the erosion of local South Asian traditions. We later present evidence suggesting that the Hajj does not produce a shift in favor of a Saudi version of Islam but rather a move toward the global mainstream.

IV.B. Tolerance

We find that Hajjis display more positive views toward other nationalities and social groups, have greater tolerance, and are more peacefully inclined (Table V).

TABLE VII
WELL-BEING

                                                    Panel A: AES coefficients                       Panel B: AES
                                                    Base        Controls    Restricted subsample    Main effect   Male interaction
(1) Rescaled K6 index                               −0.206∗∗∗   −0.206∗∗∗   −0.200∗∗∗               −0.369∗∗∗     0.326∗∗∗
                                                    (0.05)      (0.05)      (0.05)                  (0.08)        (0.09)
(2) Positive feelings                               −0.109∗∗    −0.098∗∗    −0.079                  −0.149∗∗      0.079
                                                    (0.05)      (0.04)      (0.05)                  (0.07)        (0.08)
(3) Index of satisfaction with life and finances    −0.010      0.006       0.011                   −0.028        0.036
                                                    (0.04)      (0.04)      (0.04)                  (0.05)        (0.08)
(4) Self-rated physical health                      −0.213∗∗∗   −0.219∗∗∗   −0.239∗∗∗               −0.320∗∗∗     0.210∗∗
                                                    (0.05)      (0.05)      (0.06)                  (0.07)        (0.10)

Notes. See notes for Table IV. In addition, note that Panel A gives AES estimates for our base, control, and restricted subsample specifications, whereas Panel B adds an interaction between Hajj participation and a male variable to the base AES specification. In Panel B, the instruments are success in the Hajj lottery for the main effect and success interacted with male in the interaction specification. Index component questions with number of components indicated in parentheses: Index 1 (6) [rescaled, high value=less distress]: During the past 30 days, how often did you feel: nervous? hopeless? restless or fidgety? so depressed that nothing could cheer you up? everything was an effort? worthless? Index 2 (5): During the past 30 days, how often did you feel: relaxed and peaceful? content? joyous? How much pleasure do you take in life? Altogether, are you very happy/not at all happy (four-point scale)? Index 3 (3): How satisfied with life as a whole are you (ten-point scale)? How much room for improvement in your quality of life? How satisfied are you with finances (ten-point scale)? Index 4 (2): How good is your physical health (four-point scale)? Have you been free of any 7+ day illness/injury in the past year?
TABLE VIII
ENGAGEMENT AND EXPOSURE
AES coefficients

                                    Base        Controls    Restricted subsample
(1) Socioeconomic engagement        −0.002      −0.008      0.011
                                    (0.02)      (0.02)      (0.02)
(2) Engagement in politics          −0.011      −0.010      −0.024
                                    (0.03)      (0.03)      (0.03)
(3) Formal knowledge of Islam       0.004       0.000       −0.003
                                    (0.04)      (0.03)      (0.04)
(4) Diversity knowledge             0.146∗∗∗    0.139∗∗∗    0.133∗∗∗
                                    (0.04)      (0.04)      (0.05)
(5) Gender knowledge                0.125∗∗∗    0.116∗∗∗    0.104∗∗
                                    (0.04)      (0.03)      (0.04)
(6) Global knowledge                0.083∗∗     0.086∗∗     0.072
                                    (0.04)      (0.03)      (0.05)

Notes. See notes for Table IV. Index component questions with number of components indicated in parentheses: Index 1 (15): How frequently do you visit: people in your town/village? people outside your town/village? How frequently are you visited by: people in your town/village? people from outside your town/village? How many times in the past year have close family/friends sought advice on: family matters? religious matters? business matters? How many times in the past year have more distant family/friends sought advice on: family matters? religious matters? business matters? Are you a member of following kinds of organizations: religious, professional, school? Do you work as: an employee? for yourself? Index 2 (7): Did you vote in last election? How interested in national affairs? Are you member of political party? Are you a member of a political organization? A social organization? How often do you follow national affairs? Do you have an opinion on how politicians are handling national affairs? Index 3 (10): Name as many of the five pillars of Islam as you can. Correct answer to: How many chapters in the Qu’ran? Can you recite favorite verse of the Qu’ran? What is shortest sura of the Qu’ran? What is longest? How many suras are in the Qu’ran? What is first revealed verse of the Qu’ran? Is method of prayer described in the Qu’ran? What is percentage required to be given as Zakat (charitable tax)? How long must wealth be held for Zakat to be due? Index 4 (3): Correct answers to: How many accepted schools of thought in Sunni Islam? Is a cap required for prayer? Is saying “talak, talak, talak” sufficient for legal divorce? Index 5 (8): Correct answers to: What was name of prophet’s first wife? How many wives is a man allowed at once? Can a Muslim man marry a Jewish or Christian woman? Is dowry mandatory? Further: Have you heard of Islamic law relating to adultery? Do you have an opinion about women’s lives in: Saudi Arabia? Indonesia/Malaysia? West? Index 6 (6): How many countries share a border with Pakistan? What country has largest percentage Muslim? What percentage of Nigerians are Muslim? What are world’s two most populous countries? Who is the Prime Minister of India? Which is further from Pakistan, England or the United States?
The Hajj increases an index of positive views about people from other countries by 0.15 standard deviations or more than 33% (Table V, row (1)). Hajjis update their beliefs most positively about nationalities they are likely to interact with frequently. The largest positive impact (0.32 standard deviations) is on views toward Indonesians (Table IX, row (4)), the largest non-Saudi pilgrim group and the one Hajjis report as observing the most. Hajjis also have a 0.14–standard deviation more positive view of Saudis (Table IX, row (5)). There is no effect on views of Europeans. Hajjis are also significantly more likely to declare that Indonesians
TABLE IX
SELECTED SURVEY QUESTIONS

(1) Do you believe others regard you as religious?
    Coding: 1 = Religious, 0 = Not religious
    Coef. 0.100; p-value .000; Comp. mean 0.772; Obs. 1,541; R2 .033
(2) Do you pray “Tahajjud Namaz”?
    Coding: 1 = Yes (regularly, occasionally), 0 = No (rarely, never)
    Coef. 0.184; p-value .000; Comp. mean 0.281; Obs. 1,605; R2 .047
(3) How often did you fast outside of Ramadan during the past year?
    Coding: 1 = Several times per month or more, 0 = Once per month or less
    Coef. 0.041; p-value .006; Comp. mean 0.049; Obs. 1,605; R2 .030
(4) Is your general view of Indonesian people:
    Coding: 2 = Very positive, −2 = Very negative
    Coef. 0.217; p-value .000; Comp. mean 0.362; Obs. 1,583; R2 .055
(5) Is your general view of Saudi people:
    Coding: 2 = Very positive, −2 = Very negative
    Coef. 0.110; p-value .026; Comp. mean 1.034; Obs. 1,593; R2 .026
(6) In your opinion, overall how are people of a different religion compared to your people?
    Coding: 0 = Better or worse, 1 = Same
    Coef. 0.084; p-value .004; Comp. mean 0.389; Obs. 1,604; R2 .025
(7) Do you believe that people of different religions can live in unity & agreement (harmony) in a given society by making agreements over their differences?
    Coding: 1 = Yes, 0 = No
    Coef. 0.063; p-value .074; Comp. mean 0.589; Obs. 1,270; R2 .036
(8) Do you ever pray in the mosque of a different maslak than your own?
    Coding: Binary: 1 = Frequently, 0 = Less often/never
    Coef. 0.034; p-value .021; Comp. mean 0.049; Obs. 1,463; R2 .027
(9) Do you believe the goals for which Osama is fighting are correct?
    Coding: 1 = Not correct at all/slightly incorrect, 0 = Correct/absolutely correct
    Coef. 0.063; p-value .014; Comp. mean 0.068; Obs. 761; R2 .054
(10) Do you believe the methods Osama uses in fighting are correct?
    Coding: 1 = Absolutely never/almost never correct, 0 = To small extent/some extent/strongly correct
    Coef. 0.051; p-value .112; Comp. mean 0.159; Obs. 761; R2 .063
(17) Do you think there are too many crimes against women in Pakistan? Overall
(16) What is your opinion about the quality of women’s lives in each of the following countries/regions? West
(15) What is your opinion about the quality of women’s lives in each of the following countries/regions? Saudi Arabia
(11) How important do you believe peace with India is for Pakistan’s future? (12) Please tell me what you think about the correctness of the following: family members physically punishing someone who has dishonored the family (13) In your opinion, how do men and women compare to each other with respect to the following traits: spiritually (14) What is your opinion about the quality of women’s lives in each of the following countries/regions? Indonesia/ Malaysia
Question
.145
.051
0.057
0.094
0.051
0.087
0 = Men are better/equal, 1 = Women are better 1 = Greater than in Pakistan, 0 = Lower than or equal that in Pakistan; Base variables 5 = Very high, 1 = Very low 1 = Greater than in Pakistan, 0 = Lower than or equal that in Pakistan; Base variables 5 = Very high, 1 = Very low 1 = Greater than in Pakistan, 0 = Lower than or equal that in Pakistan; Base variables 5 = Very high, 1 = Very low Binary: 0 = No, 1 = Yes 0.052
.088
0.044
0 = Correct, 1 = Never correct
.075
.006
.112
.016
0.044
1 = Important, 0 = Not important
p-value
Coef.
Coding
TABLE IX (CONTINUED)
0.597
0.186
0.322
0.262
0.111
0.261
0.913
Comp. mean
1,605
646
1,180
551
1,497
1,459
1,155
Obs.
.045
.091
.048
.058
.034
.033
.020
R2
Coding .052
.039 .036
.024
.156
.073
0.028 0.055
0.059
0.045
0.054
p-value
0.053
Coef.
0.457
0.540
0.729
0.933 0.722
0.171
Comp. mean
R2
1,562 .028
1,605 .029
1,550 .036
1,604 .027 1,550 .035
1,135 .026
Obs.
Notes. Rows contain results from individual IV regressions where the instrument is success in the Hajj lottery, and which include dummies for place of departure × accommodation category × party size category. p-values are corrected for clustering at the party level.
(18) Do you think there are too many crimes against 1 = Against women score < against women in Pakistan? Relative to men men score, 0 = Against women score ≥ against men score; Base scores 1 = Yes, a lot; 4 = No, not at all (19) In your opinion, girls should attend school Binary: 0 = Disagree, 1 = Agree (20) Until what level would you prefer allow/permit girls 0 = Never, 1 = Primary, secondary, or in your family to attend coeducational schools (boys all levels and girls in the same school)? (21) Until what level would you prefer allow/permit boys 0 = Never, 1 = Primary, secondary, or in your family to attend coeducational schools (boys all levels and girls in the same school)? (22) Would you like for your daughters or female grand- 0 = No, 1 = Yes children to have a career other than caring for the household? (23) How important are the following characteristics in 0 = Not important, 1 = Important your son’s, grandson’s wife?: Good employment or business
Question
TABLE IX (CONTINUED)
are the best practitioners of Islam (regression not reported). In follow-up open-ended interviews, Pakistani Hajjis also reported positive interactions with Indonesians. For example, one older female Hajji said, “I had a very good experience with female Hajjis from Indonesia. They would make space for me whenever I was walking if I gestured for them to do so. One of them even gave me Vicks VapoRub when she found out that I had the flu.”

The Hajj also increases an index of beliefs that adherents of different sects, ethnicities, and religions are equal by 0.13 standard deviations (Table V, row (2)). In contrast to the views on different nationalities, the largest move toward equal status is for people of a different religion (Table IX, row (6)), who would not be encountered during the Hajj, as they are not permitted to attend. Hajjis may thus be willing to extend their notions of tolerance beyond the Muslim world.

Similarly, the Hajj increases an intergroup harmony index by 0.13 standard deviations (Table V, row (3)). The index solicits applicants’ views on whether people from different ethnic groups, Islamic sects, and religions could live together in harmony in the same society. It also includes a practice-based question about how frequently the respondent prays in a mosque of a different school of thought. The effect is largest for religion, about which the control group has the lowest belief regarding harmony (Table IX, row (7)). The effect on the respondent praying in a mosque of a different school of thought is also large, almost doubling the control group mean of 4.9% (Table IX, row (8)).

We complement the harmony index by exploring the extent to which the Hajj leads to greater inclination to peace. The Hajj increases a peaceful inclination index by 0.11 standard deviations (Table V, row (4)). Examining some of the component questions, we find that the Hajj almost doubles the number of respondents who declare that Osama bin Laden’s goals are incorrect, from 6.8% to 13.1%, and increases the fraction declaring his methods incorrect from 16% to 21% (Table IX, rows (9) and (10)).12 The Hajj increases the belief that peace with India is important from 91% to 96% (Table IX, row (11)). Hajjis are also 17% more likely to say it is never correct to physically punish someone who has dishonored the family (Table IX, row (12)). Although these results are consistent with becoming more tolerant, it is also possible that the Hajj confers religious legitimacy on individuals that allows them to more willingly express previously held views.

12. Slightly more than half say his goals are correct; one-third say his methods are correct; quite a few do not answer.

One might suspect that more orthodox religious practice could be associated with support for political Islam, which advocates a closer relationship between religion and politics and which is often associated with negative perceptions of the West. We see no increase in belief either in the role of religion in politics or in more negative views of the West. It is nonetheless possible that increased tolerance tempers such desires and perceptions, if they do in fact go along with increased orthodoxy.

The Hajj reduces support for political Islam, although the effect is only weakly significant at 15% (Table V, row (5)). The political Islam index includes questions on how deeply religion should be involved in politics. Although the average respondent is likely to see a role for religion in matters of the state, Hajjis are no more likely to do so in spite of an increased attachment to global Islam. In fact, the Hajj significantly reduces beliefs that the state should enforce religious injunctions and that religious leaders should be able to dispense justice on their own.

The Hajj does not lead to any increase in an index of negative attitudes toward the West (Table V, row (6)) that encompasses views on adopting Western social values and technologies and commonly held suspicions toward the West. We can reject a negative effect of one-twentieth of a standard deviation with 95% confidence. We find no evidence that the Hajj affects the tails of the distribution of attitudes toward political Islam and views toward the West. Moreover, although young people may potentially be more susceptible to intolerance, the six tolerance indices examined do not show any differential Hajj effect for younger pilgrims (regressions not reported).
If anything, the harmony index shows a more positive effect for the young.

IV.C. The Hajj and Gender

We noted earlier that the Hajj may provide Pakistani pilgrims with a novel opportunity in which men and women interact, perform rituals as equals, and observe the gender roles of other nationalities. Perhaps on account of this, we find that the Hajj causes a 0.12–standard deviation increase in an index of questions about the status of women relative to men along intellectual, spiritual, and moral dimensions (Table VI, row (1)). The effect is largest on the spiritual dimension, with an increase of over 50% (from 11% to 17%) in belief that women are better (Table IX, row (13)). Hajjis are also more likely to believe that, although there are gender differences, women’s overall status is equal. The Hajj also increases an index that captures awareness of women’s quality of life issues in Pakistan by 0.16 standard deviations (Table VI, row (2)). The index includes respondents’ ratings of women’s quality of life in other countries relative to Pakistan. The largest effect is on the relative quality of life of Indonesian/Malaysian women being higher, paralleling the previous results on views of other nationalities (Table IX, row (14)). Interestingly, Hajjis show a greater increase in their views on the relative quality of life of women in the West compared to Saudi Arabia (Table IX, rows (15) and (16)). Hajjis are also more likely to think that crimes against women are high, both on an absolute scale and relative to crimes against men (Table IX, rows (17) and (18)).

Do the more favorable assessment of women’s qualities and the greater concern regarding their quality of life in Pakistan go along with a changed view of the role women ought to take in society? We construct three indices to explore the areas of girls’ education, women’s workforce participation and choice of professions, and the willingness to challenge the authority of men relative to women within the household and in social contexts (Table VI, rows (3)–(5)). The Hajj increases favorable views toward education for girls by about 0.09 standard deviations (Table VI, row (3)). We find positive Hajj effects on all components except equal educational attainment across gender. The Hajj increases the desire that girls attend school from 93% to 96% (Table IX, row (19)).
Hajjis are 8% more willing to allow both their boys and girls to attend coeducational schools at all levels (Table IX, rows (20) and (21)). Further, the Hajj increases an index of questions about women’s workforce participation and profession choice by 0.12 standard deviations (Table VI, row (4)). The Hajj has a substantial impact on each index component. For example, the Hajj increases the fraction desiring that their daughters/granddaughters work from 54% to 60% (Table IX, row (22)). Hajjis are also 12% more likely to think it is important that their future daughter-in-law be employed (Table IX, row (23)).
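The percentage statements in this section follow from combining a Table IX coefficient with the comparison mean of the same question. The sketch below shows that arithmetic under two assumptions: each question is coded 0/1, so the comparison mean is a share, and the coefficient and comparison-mean pairs are matched to rows as the text implies, which should be checked against the published table. The helper names are hypothetical.

```python
def implied_level(coef, comp_mean):
    """Share among Hajjis implied by the comparison mean plus the Hajj coefficient."""
    return comp_mean + coef


def relative_change(coef, comp_mean):
    """Hajj coefficient expressed as a fraction of the comparison mean."""
    return coef / comp_mean


# (coefficient, comparison mean) pairs as read from Table IX; the row pairing is an assumption.
rows = {
    "regarded as religious, row (1)": (0.100, 0.772),
    "Osama's goals incorrect, row (9)": (0.063, 0.068),
    "girls should attend school, row (19)": (0.028, 0.933),
    "daughters to work, row (22)": (0.059, 0.540),
}

for label, (coef, comp_mean) in rows.items():
    level = implied_level(coef, comp_mean)
    change = relative_change(coef, comp_mean)
    print(f"{label}: {comp_mean:.1%} -> {level:.1%} (relative change {change:.0%})")
```

Reporting both forms is useful because a small coefficient can be a large relative change when the comparison mean is low, as with the question on Osama bin Laden's goals.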
However, Hajjis’ more favorable views of women do not extend to challenging male authority in the household (Table VI, row (5)). The Hajj has little or no impact on an index that includes questions regarding whether the respondent challenges traditional attitudes on women’s roles in domestic matters, such as fertility decisions and marrying against parental wishes, and unequal Islamic rules on gender, such as those related to inheritance laws and providing financial witness. This is perhaps unsurprising given the greater authority and responsibility typically accorded to men along several dimensions within Islam.

Nevertheless, the changed perceptions about gender roles do seem to accompany changes in household behavior. The Hajj increased the fraction reporting occasional marital disagreements by 10 percentage points, a large increase relative to the comparison mean of 15% (regression not reported). Because most married couples perform the Hajj together, it is not possible to separate this effect by the respondent’s gender (because it reflects both their own and their spouse’s Hajj impact).

Although sample size limitations do not readily allow us to examine heterogeneity of the impact of the Hajj, we find that only the girls’ education index shows a smaller increase for men than women, who in any case already have close to 100% agreement with the view that girls should be educated (regressions not reported). In fact, the Hajj leads to somewhat larger changes in the indices of views on women and quality of life for male Hajjis than for female ones.

IV.D. Well-Being

Hajjis, primarily women, are more likely to report negative feelings that suggest distress, and are less likely to report positive feelings of well-being (Table VII, rows (1), (2), (5), and (6)). This could potentially be due to the changes in Hajjis’ beliefs and frame of reference discussed above (which the psychology literature suggests can lead to stress), to financial stress associated with the cost of the Hajj, or to the impact of the Hajj on physical health.

Hajjis report somewhat higher distress, as measured by a version of the K6 screening scale (Kessler et al. 2003).13 The index aggregates respondents’ experience of six negative feelings in the past month, which we rescale so that a higher value represents less distress. Although applicants had a low level of underlying distress, the Hajj reduces the index by 0.21 standard deviations (Table VII, row (1)). The Hajj also reduces an index of five positive feelings by 0.11 standard deviations (row (2)). In the restricted subsample, the Hajj effect drops slightly to 0.08 standard deviations with a marginal significance of 11%.

13. We should caution that, to our knowledge, the K6 index has not been formally validated for Pakistan.

The increase in distress falls entirely on women (Table VII, rows (5) and (6)). On both the rescaled K6 index and the positive feelings index, there is no significant effect of the Hajj on men. Increased distress might be due to the stark contrast between the typical Pakistani woman’s daily life and the relatively greater equality and integration experienced during the Hajj. The impact of the Hajj on gender attitudes suggests an increased realization that the constraints and restrictions women are accustomed to in Pakistan may not be part of global Islam. The literature in psychology (Crosby 1991; Lantz et al. 2005) suggests that such changes in frame of reference can induce significant stress, although eventually the stress helps deal with the change.

Although the Hajj has a negative impact on a female pilgrim’s emotional state, it does not affect overall life satisfaction, either on average, or for women (rows (3) and (7)). We can reject a negative effect on the index of life satisfaction of about one-tenth standard deviation with 95% confidence.

Although we cannot rule it out, we do not see much evidence for the hypothesis that the substantial financial expenditure required by the Hajj creates financial stress that accounts for Hajjis’ negative feelings. In fact, we can reject the hypothesis that the Hajj has a negative effect of more than one-twelfth of a standard deviation on the individual component question about satisfaction with finances. The Hajj also does not affect monthly household consumption expenditures or a measure of household assets (regressions not reported).
Our interviews reveal that most do not consider the pool of savings for the Hajj as fungible; those unable to go keep these Hajj funds in order to reapply in the future. The Hajj leads to a 0.21–standard deviation reduction in an index of physical health (Table VII, row (4)) that includes self-reported physical health and illness/injury. Although the decrease in self-perceived health could be due to a change in the reference group for Hajjis from local people to those encountered from other countries on the Hajj, it is not clear that this can account for the doubling in reports of serious physical injury or illness.
The negative physical health effects are also stronger for women (row (8)), suggesting that part of the negative effect of the Hajj on women’s feelings of well-being could be explained by poorer physical health.14 However, the negative point estimates on men’s physical health are larger than the effect on the K6 index (0.11 vs. 0.04 standard deviations), suggesting that the two do not exactly co-move. Also, the coefficient on Hajj lottery success in a regression predicting the K6 index is similar whether or not one controls for physical health, providing further suggestive evidence that the channel for Hajj effects on emotional health is not simply through physical health.
14. Negative physical health effects are not larger for older people.
V. POTENTIAL CHANNELS
Although our methodology does not provide experimental variation that isolates the potential channels through which the Hajj may impact the pilgrim, we can offer some suggestive evidence. We consider both external channels, which operate by changing the environment a Hajji faces upon return, and internal channels, which reflect changes in Hajjis’ beliefs and preferences. We argue that the evidence points toward the importance of the internal channel and, within that, to exposure to people of differing nationalities, sects, and gender.
V.A. External Social Environment
Historical accounts suggest that the Hajj confers social prestige and legitimacy (Donnan 1989; Eickelman and Piscatori 1990; Yamba 1995), although some anecdotal evidence suggests that contemporary Hajjis no longer experience this increase in social status (Scupin 1982). A changed social role may bring expectations for the changed behavior and beliefs that are reflected in our results. For example, Hajjis may be expected to be more religious, and may practice more to fulfill that expectation. Alternatively, increased religious legitimacy may allow Hajjis to express longstanding opinions they have not expressed before, such as those opposing Osama Bin Laden. We find no impact of the Hajj on an index of social status and engagement (Table VIII, row (1)). The fifteen components include the frequency of social visits, the giving of advice, and membership in social organizations.15 We might further expect social standing to be reflected in awareness of and engagement in political affairs. In fact, we find no impact of the Hajj on a political engagement index (Table VIII, row (2)) that asks about voting, interest in national affairs, political opinion, and membership in political organizations. It is possible that the Hajj once led to a much greater change in social roles than it does currently, and that the increased rate of participation in the Hajj due to lower travel costs has reduced the social prestige associated with completing the Hajj. In any case, it does not seem likely that changes in the social role upon return can account for the findings in the previous sections.
V.B. Internal State
The Hajj may alter an individual’s internal state, changing beliefs and preferences. For example, Hajjis may undergo a change in religious commitment during the pilgrimage that increases orthodoxy in religious practice and leads them to greater tolerance and belief in gender equity consistent with the Qur’an. Alternatively, Hajjis’ increased tolerance and changed gender attitudes may reflect their new exposure to people from different countries and sects and to members of the opposite gender outside their family. Although we cannot rule out the religious dimension, we interpret the evidence as pointing more toward the increased exposure to Muslims from around the world. We find that the Hajj does not increase an index of formal religious knowledge (Table VIII, row (3)) but does increase indices of experiential knowledge about diversity of opinion within Islam, gender within Islam, and the world more broadly (Table VIII, rows (4)–(6)).
The changes in experiential knowledge point to the importance of interaction with and observation of other groups.16 Furthermore, to the extent that a spiritual transformation and change in religious commitment would be accompanied by a desire to acquire greater religious knowledge, these results do not suggest that such a change is a primary driver of the findings.
15. All except two component questions are not significant even at the 20% level (Online Appendix 5). These two show that Hajjis are slightly more likely to have visitors from out of town, and slightly more likely to be self-employed (p-values of .14 each). However, the magnitudes of these effects are small (5% and 3%).
16. Although some of our results could be due to a generic effect of traveling to a different country rather than the experience of the Hajj, it seems unlikely this accounts for all of the results. For example, it is hard to see why a pure travel effect would lead to more positive beliefs about Indonesians.
The Hajj increases the index measuring knowledge of diversity within Islam by 0.15 standard deviations (Table VIII, row (4)). Index components include questions on how schools of Sunni thought differ, such as whether it is necessary to wear a prayer cap. The index of gender knowledge and awareness, which combines eight questions on gender and marriage in Islam and on having an opinion on women’s issues, increases by 0.13 standard deviations (Table VIII, row (5)).17 Similarly, Hajjis also show an increase of 0.08 standard deviations in the global knowledge index, which reflects general awareness of the world outside Pakistan (Table VIII, row (6)).18 Pilgrims who travel in smaller parties, and thus have more opportunity to interact with non-Pakistanis, experience larger gains in the diversity, gender, and global knowledge indices, as well as in positive views of people from other countries. This is consistent with the idea that the exposure channel is important. The coefficients on the interaction between the Hajj and small party size are large and significant at conventional levels for the gender and global knowledge indices and for positive views of other countries. The small-party interaction effects are 0.13, 0.14, and 0.14 standard deviations, respectively, with p-values of .07, .10, and .06. Point estimates for the group size interaction on other tolerance indices also point to a similar story. The interactions are robust to including other demographic controls and their interactions. However, we cannot rule out that unobservable differences between parties of different size are driving the interaction effects. We would expect Hajj effects to also be larger for those with less prior exposure to situations similar to the Hajj. However, very few respondents had previously traveled outside Pakistan, which limits our power to test this interaction. 
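The party-size comparison can be read as an interaction specification. The sketch below, under stated assumptions, regresses a hypothetical outcome on a Hajj indicator, a small-party indicator, and their product; the coefficient on the product corresponds to the kind of small-party interaction effect reported above. The names `interaction_ols`, `hajj`, and `small_party` and the toy data are illustrative assumptions, and the sketch omits the controls and the lottery-based identification used in the paper.

```python
import numpy as np


def interaction_ols(y, hajj, small_party):
    """Least-squares fit of y on a constant, two indicators, and their product."""
    y = np.asarray(y, dtype=float)
    hajj = np.asarray(hajj, dtype=float)
    small_party = np.asarray(small_party, dtype=float)
    design = np.column_stack(
        [np.ones_like(y), hajj, small_party, hajj * small_party]
    )
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    names = ["const", "hajj", "small", "hajj_x_small"]
    return {name: float(value) for name, value in zip(names, coef)}


# Hypothetical cell means of a knowledge index (in standard deviation units):
# one observation per combination of the two indicators.
y = [0.00, 0.10, 0.05, 0.29]
hajj = [0, 1, 0, 1]
small_party = [0, 0, 1, 1]

coef = interaction_ols(y, hajj, small_party)
print(round(coef["hajj_x_small"], 2))
```

In this toy example the interaction coefficient is the extra gain among small-party pilgrims beyond the sum of the separate Hajj and small-party differences. As the text cautions, such a coefficient cannot by itself rule out unobserved differences between parties of different size.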
There are a few robust interactions with literacy and urban residence, though it is unclear if these relate to the prior exposure that is relevant. Although urban applicants do show a smaller decrease in localized beliefs and practices, the literate see larger gains in some of the experiential knowledge measures, suggesting that literacy may partly be picking up other factors such as ability to interact with others. Because party size is often assigned by banks, and is thus less likely to be correlated with unobserved factors, we prefer focusing on it as a test of exposure.
Although Hajjis are also exposed to Saudi Arabia and its people, we think this is unlikely to drive our observed effects. Only the move away from localized religious practices seems consistent with a Saudi influence. Saudi Arabia is generally less accepting of other schools of thought and enforces strict gender segregation; Hajj impacts on gender views are more in line with the more liberal attitudes in other Muslim countries.19 Our results thus suggest that Hajjis are likely to be influenced by the practices and beliefs of the typical pilgrim that they encounter during the Hajj, with possibly greater salience to those groups that are more visibly different or are regarded as better in some way, such as in their behavior or organization (factors often mentioned in our interviews). Exposure may therefore induce convergence of belief to the Islamic mean. To the extent that this convergence is a significant force, some of our results may differ for pilgrims from other countries.
17. Four of the eight questions in the gender knowledge and awareness index are about awareness rather than knowledge: whether the respondent has heard of the Islamic law against adultery and whether they have an opinion on women’s lives in three different countries. The Hajj has a somewhat smaller, but still significant, effect on a pure gender knowledge index that is constructed without these questions.
18. We should note that this latter effect falls slightly to 0.07 standard deviations and is marginally significant at 12% in the restricted subsample.
VI. CONCLUSIONS
Our findings show that the Hajj induces a shift from localized beliefs and practices toward global Islamic practice, increases tolerance and peaceful inclinations, and leads to more favorable attitudes toward women. This demonstrates that deep-rooted attitudes such as religious beliefs and views about others can be changed and also challenges the view that Islamic orthodoxy and extremism are necessarily linked. We conclude with some tentative implications of our results on how social institutions help shape individual beliefs and identity and, at a macro level, how they may foster unity within belief systems.
19. A comparison of gender views across questions from the World Values Surveys shows that Saudis indeed have more conservative gender views than Pakistanis, whereas Pakistanis in turn are more conservative than Indonesians. Fully 62% of Saudis believe a university education is more important for men than women, compared to 24% of Pakistanis and 17% of Indonesians. Similarly, 34% of Saudis do not think that both husband and wife should contribute to household income, compared to 30% of Pakistanis and 15% of Indonesians.
The social psychology literature suggests that social interactions can lead to either positive or negative feelings toward other groups depending on whether the setting is competitive or cooperative. Several features of the Hajj may create a setting in which the interaction among different groups helps build common purpose and identity. Other social institutions share such features with the Hajj, for example medical education, police/military basic training, and international peace camps. As on the Hajj, participants in these institutions leave their everyday environments and their restrictions on mixing across certain lines, such as ethnicity and social class, to enter a setting in which they collectively perform similar actions, often physically strenuous ones, which require cooperation from others. Furthermore, participants in all these institutions accentuate their similarity, often by taking on common dress or hairstyle during the experience and a common title afterward. It also seems likely that the religious element of the Hajj plays a role beyond providing a cooperative setting. For example, Hajjis’ changed attitudes on gender appear to be circumscribed by those norms broadly accepted in Islam. Further, it is plausible that the religious context provides the legitimacy that makes it acceptable for adherents to alter their views. If a Pakistani woman observes her Indonesian counterpart engaging equally with her spouse without compromising her piety, she may also consider it permissible to do so. If pilgrims see others praying somewhat differently yet without interference in the holiest of Muslim places, they may reason that some degree of religious diversity is acceptable. Our results also shed light on why religions often mandate practices that are costly for individual adherents.
Although club good models that apply the framework of individual rationality, as in Iannaccone (1992) and Berman (2000), deliver compelling explanations, additional insights can be obtained using an evolutionary framework in which institutions and prescriptions that reinforce and propagate the religion’s beliefs and practices are more likely to persist. By moving pilgrims toward the religious mainstream, the Hajj may help Islam overcome an evolutionary hurdle faced by world religions: maintaining unity in the face of the divergence of practices and beliefs through local adaptations. A number of religious institutions, including written holy texts and central authorities, can help overcome this hurdle. Sunni
Islam lacks a central authority, and so the role of pilgrimage may be particularly important. However, achieving convergence and maintaining unity likely require that there be limits to how much diversity is allowed. Too diverse a group may make it difficult to find common ground and too much variance in beliefs increases the likelihood that undesirable religious innovations will spread. It is therefore noteworthy that although people of different faiths made the pilgrimage to Mecca in pre-Islamic times (Armstrong 1997), its institutionalization with Islam’s emergence was accompanied by restricting it to Muslims and disallowing non-Islamic practices that were once elements of the pilgrimage. Both the evolutionary and club good perspectives imply that religions with practices that generate positive externalities for other adherents, provided these are socially efficient, are more likely to persist by raising the attractiveness of being an adherent. Historically, undertaking the Hajj may have created positive externalities for other Muslims both through its effect on tolerance and by facilitating economic trade and the diffusion of economic, cultural, and scientific ideas (Bose 2006). Although we find little evidence of individual medium-term gains in socioeconomic status and engagement in our sample, there is clear evidence for a positive externality in the increased tolerance toward others. Of course, given the significant financial and health costs entailed in undertaking the Hajj, individuals still need to be induced to participate. However, this can be done through religious injunctions, sanctions, and rewards. The Hajj is one of the five pillars of Islam and there is the belief that performing it sincerely cleanses one of all sins. Models of costly religious practices also often argue that these practices signal commitment and screen out those who may free ride on the religious community. 
Because individuals have already signaled such commitment by applying to the Hajj (less than 1% withdraw), our comparisons between applicants are not influenced by signaling effects. Our results therefore indeed capture a treatment effect. The fact that the Hajj has a direct treatment effect is not surprising, because one would expect that were it only serving a signaling function, it would decline in observance relative to alternate practices that could screen just as well but at a lower cost. The findings in this paper also pose the question of whether pilgrimages or central gatherings may foster such unity in other
belief systems, religious or otherwise, and conversely, whether their absence increases susceptibility to schisms. The Kumbh Mela, bringing together millions of Hindus every three years, along with Catholic pilgrimages to Lourdes and Rome, may play such a cohesive role. Nonreligious examples include national political conventions in the United States that may promote party unity and exchange among delegates from different regions. Conversely, the split between Judaism and Christianity occurred shortly after the destruction by the Romans of the Jewish temple in Jerusalem, which was a central gathering place, in the year 70 A.D. One may even conjecture whether the multiplication of Protestant sects would have been muted had there been a central holy site for pilgrimage among Protestants. Further insights are likely to come from investigating the impact of the Hajj over different durations and on pilgrims from other countries that differ from Pakistan in their attitudes and exposure, and the impact of other pilgrimages. Because several other countries also allocate Hajj visas by lottery, it should be possible to use the same methodology. More generally, one could use similar approaches to examine the impact of other institutions on social identity. For example, one could use draft lotteries (Angrist 1990) to examine the impact of military service on social identity or regression discontinuity designs to examine the impact of professional training on beliefs and attitudes. Building up evidence from a series of such studies would shed additional light on the broader roles played by institutions, religious and nonreligious, in the shaping of beliefs and identity and the evolution of ideologies and belief systems.
APPENDIX: OUTLINE OF THE HAJJ PILGRIMAGE
CASE WESTERN RESERVE UNIVERSITY HARVARD UNIVERSITY HARVARD UNIVERSITY, THE BROOKINGS INSTITUTION, AND THE NATIONAL BUREAU OF ECONOMIC RESEARCH
REFERENCES
Ahmed, Q. A., Y. M. Arabi, and Z. A. Memish, “Health Risks at the Hajj,” The Lancet, 367 (2006), 1008–1015.
Angrist, Joshua, “Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records,” American Economic Review, 80 (1990), 313–336.
Armstrong, K., Jerusalem: One City, Three Faiths (New York: Ballantine Books, 1997).
Aronson, E., and S. Patnoe, The Jigsaw Classroom (New York: Longman, 1997).
Azarya, V., Aristocrats Facing Change: The Fulbe in Guinea, Nigeria, and Cameroon (Chicago: University of Chicago Press, 1978).
Barro, Robert, and R. M. McCleary, “Religion and Economic Growth across Countries,” American Sociological Review, 68 (2003), 760–781.
Berman, Eli, “Sect, Subsidy, and Sacrifice: An Economist’s View of Ultra-orthodox Jews,” Quarterly Journal of Economics, 115 (2000), 905–953.
Bianchi, R., Guests of God: Pilgrimage and Politics in the Islamic World (Oxford, UK: Oxford University Press, 2004).
Boisjoly, J., G. Duncan, M. Kremer, D. Levy, and J. Eccles, “Empathy or Antipathy? The Consequences of Racially and Socially Diverse Peers on Attitudes and Behaviors,” American Economic Review, 96 (2006), 1890–1906.
Bose, Sugata, A Hundred Horizons: The Indian Ocean in the Age of Global Empire (Cambridge, MA: Harvard University Press, 2006).
Clingingsmith, David, Asim Ijaz Khwaja, and Michael Kremer, “Estimating the Impact of the Hajj: Religion and Tolerance in Islam’s Global Gathering,” KSG Working Paper RWP08-22, 2008.
Crosby, F., Juggling: The Unexpected Advantages of Balancing Career and Home for Women and Their Families (New York: Free Press, 1991).
DeVries, D. L., and R. E. Slavin, “Teams-Games-Tournaments (TGT): Review of Ten Classroom Experiments,” Journal of Research and Development in Education, 12 (1978), 28–38.
Donnan, Hastings, “Symbol and Status: The Significance of the Hajj in Pakistan,” Muslim World, 79 (1989), 205–216.
Eickelman, Dale F., and James Piscatori, “Muslim Travellers: Pilgrimage, Migration, and the Religious Imagination,” in Comparative Studies on Muslim Societies, Vol. 9, Barbara D. Metcalf, ed. (Berkeley: University of California Press, 1990).
Fisman, Raymond, S. Iyengar, E. Kamenica, and Itamar Simonson, “Racial Preferences in Dating,” Review of Economic Studies, 75 (2008), 117–132.
Glaeser, E., and S. Glendon, “Incentives, Predestination and Free Will,” Economic Inquiry, 36 (1998), 429–443.
Guiso, Luigi, Paola Sapienza, and Luigi Zingales, “People’s Opium? Religion and Economic Attitudes,” Journal of Monetary Economics, 50 (2003), 225–282.
Iannaccone, Laurence R., “Sacrifice and Stigma: Reducing Free-Riding in Cults, Communes, and Other Collectives,” Journal of Political Economy, 100 (1992), 271–291.
Ibn Battuta, Tuhfat al-Nuzzar fi Ghara’ib al-Amsar (Beirut: Dar al-Kutub al-’Ilmiyya, 2002 [originally published ca. 1355]).
Johnson, D. W., and R. T. Johnson, “The Socialization and Achievement Crisis: Are Cooperative Learning Experiences the Solution?” in Applied Social Psychology Annual, Vol. 4, L. Bickman, ed. (Beverly Hills, CA: Sage, 1983).
Kessler, R. C., P. R. Barker, L. J. Colpe, J. F. Epstein, J. C. Gfroerer, E. Hiripi, M. J. Howes, S-L. T. Normand, R. W. Manderscheid, E. E. Walters, and A. M. Zaslavsky, “Screening for Serious Mental Illness in the General Population,” Archives of General Psychiatry, 60 (2003), 184–189.
Kling, J. R., J. B. Liebman, L. F. Katz, and L. Sanbonmatsu, “Moving to Opportunity and Tranquility: Neighborhood Effects on Adult Economic Self-Sufficiency and Health from a Randomized Housing Voucher Experiment,” Princeton IRS Working Paper 481, 2004.
Lantz, P., J. House, R. Mero, and D. Williams, “Stress, Life Events, and Socioeconomic Disparities in Health: Results from the Americans’ Changing Lives Study,” Journal of Health and Social Behavior, 46 (2005), 274–288.
Low, Michael, Empire of the Hajj: Pilgrims, Plagues, and Pan-Islam under British Surveillance, unpublished M.A. thesis, 2007.
McCleary, R. M., “Salvation, Damnation, and Economic Incentives,” Journal of Contemporary Religion, 22 (2007), 49–74.
Naipaul, V. S., Among the Believers: An Islamic Journey (New York: Vintage, 1981).
O’Brien, Peter C., “Procedures for Comparing Samples with Multiple Endpoints,” Biometrics, 40 (1984), 1079–1087.
Organization of the Islamic Conference, Report on the Seventeenth Islamic Conference of Foreign Ministers: Resolutions on Legal, Information, and Political Affairs (http://www.oicoci.org/english/conf/fm/17/17%20icfm-political-en.htm, 2007).
Pettigrew, T. F., and L. R. Tropp, “A Meta-Analytic Test of Intergroup Contact Theory,” Journal of Personality and Social Psychology, 90 (2006), 751–783.
PEW Forum, “Public Expresses Mixed Views of Islam, Mormonism,” in Pew Forum on Religion & Public Life (http://pewforum.org/surveys/religionviews07/, 2007).
Putnam, Robert, “American Grace: The Changing Role of Religion in America,” Presentation, John F. Kennedy School of Government, Harvard University, Fall 2007 Faculty Research Seminar.
Sacerdote, B., and E. L. Glaeser, “Education and Religion,” National Bureau of Economic Research Working Paper 8080, 2001.
Scupin, Raymond, “The Social Significance of the Hajj for Thai Muslims,” Muslim World, 72 (1982), 25–33.
Sherif, M., O. Harvey, B. White, W. Hood, and Carolyn Sherif, Intergroup Conflict and Cooperation: The Robber’s Cave Experiment (Norman: University of Oklahoma Press, 1954).
Slavin, R. E., and R. Cooper, “Improving Intergroup Relations: Lessons Learned from Cooperative Learning Programs,” Journal of Social Issues, 55 (1999), 647–664.
Stephan, W. G., “School Desegregation: An Evaluation of Predictions Made in Brown v. Board of Education,” Psychological Bulletin, 85 (1978), 217–238.
The Sunday Times, “Terror Watch on Mecca Pilgrims,” January 21, 2007.
Tajfel, H., “Experiments in Intergroup Discrimination,” Scientific American, 223 (1970), 96–102.
Tajfel, H., and J. C. Turner, “The Social Identity Theory of Inter-Group Behavior,” in Psychology of Intergroup Relations, S. Worchel and L. W. Austin, eds. (Chicago: Nelson-Hall, 1986).
Wolfe, M., One Thousand Roads to Mecca: Ten Centuries of Travelers Writing about the Muslim Pilgrimage (New York: Grove Press, 1997).
X, Malcolm, The Autobiography of Malcolm X, with Alex Haley (New York: Grove Press, 1965).
Yamba, C., Permanent Pilgrims: The Role of Pilgrimage in the Lives of West African Muslims in Sudan (Edinburgh: Edinburgh University Press, 1995).
MULTINATIONAL FIRMS, FDI FLOWS, AND IMPERFECT CAPITAL MARKETS∗
POL ANTRÀS MIHIR A. DESAI C. FRITZ FOLEY
This paper examines how costly financial contracting and weak investor protection influence the cross-border operational, financing, and investment decisions of firms. We develop a model in which product developers can play a useful role in monitoring the deployment of their technology abroad. The analysis demonstrates that when firms want to exploit technologies abroad, multinational firm (MNC) activity and foreign direct investment (FDI) flows arise endogenously when monitoring is nonverifiable and financial frictions exist. The mechanism generating MNC activity is not the risk of technological expropriation by local partners but the demands of external funders who require MNC participation to ensure value maximization by local entrepreneurs. The model demonstrates that weak investor protections limit the scale of MNC activity, increase the reliance on FDI flows, and alter the decision to deploy technology through FDI as opposed to arm’s length technology transfers. Several distinctive predictions for the impact of weak investor protection on MNC activity and FDI flows are tested and confirmed using firm-level data.
I. INTRODUCTION
Firms globalizing their operations and the associated capital flows have become major features of the world economy. These cross-border activities and capital flows span institutional settings with varying investor protections and levels of capital market development. Although the importance of institutional heterogeneity in dictating economic outcomes has been emphasized, existing analyses typically ignore the global firms and the capital flows that are now commonplace. Investigating how global firms make operational and financing decisions in a world of heterogeneous institutions promises to provide a novel perspective on observed patterns of flows and firm activity.
∗ The statistical analysis of firm-level data on U.S. multinational companies was conducted at the Bureau of Economic Analysis, U.S. Department of Commerce, under arrangements that maintain legal confidentiality requirements. The views expressed are those of the authors and do not reflect official positions of the U.S. Department of Commerce. The authors thank Robert Barro, four anonymous referees, Gita Gopinath, James Markusen, Aleh Tsyvinski, Bill Zeile, and seminar participants at Boston University, Brown University, Hitotsubashi University, MIT, the NBER ITI program meeting, the New York Fed, Oxford, UC Berkeley, UC Boulder, Universidad de Vigo, Universitat Pompeu Fabra, the University of Michigan, and the World Bank for helpful suggestions. Davin Chor provided excellent research assistance.
© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009
This paper develops and tests a model of the operational and financial decisions of firms as they exploit their technologies in countries with differing levels of investor protections. The model demonstrates that multinational firm (MNC) activity and foreign direct investment (FDI) arise endogenously in settings characterized by financial frictions. The model generates several predictions regarding how investor protections influence the use of arm’s length technology transfers, the degree to which MNC activity is financed by capital flows, the extent to which multinationals take ownership in foreign projects, and the scale of multinational operations. These predictions are tested using firm-level data on U.S. MNCs. The model considers the problem of a firm that has developed a proprietary technology and is seeking to deploy this technology abroad with the help of a local entrepreneur. A variety of alternative arrangements, including an arm’s length technology transfer or directly owning and financing the entity that uses it, are considered. External investors are a potential source of funding, but they are concerned with managerial misbehavior, particularly in settings where investor protections are weak. The central premise of the model is that developers of technologies are particularly useful monitors for ensuring that local entrepreneurs are pursuing value maximization. The concerns of external funders regarding managerial misbehavior lead to optimal contracts in which the developer of the technology is required to hold an ownership claim in the foreign project and, in certain cases, this developer is also required to provide financial capital to the local entrepreneur. As such, MNCs and FDI flows arise endogenously in response to concerns over managerial misbehavior and weak investor protections.1 Extending the model to allow for a similar form of monitoring by external investors does not vitiate the primary results. 
We also show that although simple revenue-sharing agreements may provide incentives for technology developers to monitor, this type of contract is generally not optimal.
1. The experience of Disney in Japan, as documented in Misawa (2005), provides one example of the mechanism that drives the behavior of external investors. In 1997, Disney was evaluating how to structure a new opportunity with a local partner in Japan. Japanese banks expressed a strong preference for equity participation by Disney over a licensing agreement in order to ensure that Disney had strong incentives to monitor the project and ensure value maximization. The concerns of these lenders and the intuition that Disney would have a unique ability to monitor local partners are reflective of the central ideas of the model.
The characterization of MNCs as developers of technologies has long been central to models explaining MNC activity. In contrast to those models that emphasize the risk of technology expropriation, the model in this paper emphasizes financial frictions, a cruder form of managerial opportunism and the role of external funders. As such, although technology is central to these other models and the model in this paper, the mechanism generating MNC activity is entirely distinct. Our emphasis on monitoring builds on the theory presented by Holmstrom and Tirole (1997), which captures how monitoring is critical to understanding financial intermediation. Our model delivers several novel predictions about the nature of FDI and patterns of MNC activity. First, the model predicts that arm’s length technology transfers will be more common, relative to the deployment of that technology through affiliate activity, in countries where investor protections are stronger. Second, the share of activity abroad financed by capital flows from the multinational parent will be decreasing in the quality of investor protections in host economies. Third, ownership shares by multinational parents will also be decreasing in the quality of investor protections in host economies. These predictions reflect the fact that monitoring by the developer of the technology is more critical in settings where investor protections are weaker. The model also predicts that the scale of activity based on multinational technologies in host countries will be an increasing function of the quality of the institutional environment. Better investor protections reduce the need for monitoring and therefore allow for a larger scale of activity. We test these predictions using the most comprehensive available data on the activities of U.S. MNCs and on arm’s length technology transfers by U.S. firms. These data provide details on the worldwide operations of U.S. 
firms, including measures of parental ownership, financing and operational decisions, and information on royalty payments and licensing fees received by U.S. firms from unaffiliated foreign persons. The data enable the use of parentyear fixed effects that implicitly control for a variety of unobserved attributes. The analysis indicates that the likelihood of using arm’s length technology transfer to serve a foreign market increases with measures of investor protections, as suggested by the model. The predictions on parent financing and ownership decisions are also confirmed to be a function of the quality of investor
protections and the depth of capital markets. The model also suggests that these effects should be most pronounced for technologically advanced firms because these firms are most likely to be able to provide valuable monitoring services. The empirical evidence indicates a differential effect for such firms. Settings where ownership restrictions are liberalized provide an opportunity to test the final prediction of the model. The model implies that ownership liberalizations should have a particularly large effect on multinational affiliate activity in countries with weak investor protections. Our empirical analysis confirms that affiliate activity increases by larger amounts after liberalizations in countries with weaker investor protections. This paper extends the large and growing literature on the effects of investor protections and capital market development on economic outcomes to an open-economy setting where firms make operational and financial decisions across borders. La Porta et al. (1997, 1998) relate investor protections to the concentration of ownership and the depth of capital markets. A large literature, including King and Levine (1993), Levine and Zervos (1998), Rajan and Zingales (1998), Wurgler (2000), and Acemoglu, Johnson, and Mitton (2005), has shown that financial market conditions influence firm investment behavior, economic growth, and industrial structure. By exclusively emphasizing firms with local investment and financing, this literature has neglected how cross-border, intrafirm activity responds to institutional variations. 
The open-economy dimensions of institutional variations have been explored but overwhelmingly in the context of arm’s-length cross-border lending as in Gertler and Rogoff (1990), Boyd and Smith (1997), and Shleifer and Wolfenzon (2002).² In related work, Albuquerque (2003) shows that the differential alienability of FDI and portfolio inflows can allow the risk of expropriation to alter the composition of capital inflows. In contrast to this work, we derive the existence of MNCs and FDI flows in response to the possibility of opportunism by private actors. Accordingly, our empirical work employs firm-level data that allow us to

2. Gertler and Rogoff (1990) show how arm’s length lending to entrepreneurs in poor countries is limited by their inability to pledge large amounts of their own wealth. This insight is embedded into an MNC’s production decisions in the model presented here. Our setup also relates to Shleifer and Wolfenzon (2002), who study the interplay between investor protection and equity markets. In contrast, Kraay et al. (2005) emphasize the role of sovereign risk in shaping the structure of world capital flows.
analyze both patterns of firm activity and financial flows rather than the division of aggregate capital flows between FDI and portfolio flows. In short, we show that weak financial institutions decrease the scale of MNC activity but simultaneously increase the reliance on capital flows from the parent. As such, observed patterns of capital flows reflect these two distinct and contradictory effects. The empirical investigations of microdata provided in the paper indicate that both effects are operative.³ By jointly considering the determinants of MNC activities and the flows of capital that support these activities, the paper also links two literatures: the international trade literature on multinationals and the macroeconomic literature on capital flows. Industrial-organization and international-trade scholars characterize multinationals as having proprietary assets and emphasize the role of market imperfections, such as transport costs and market power, in determining patterns of multinational activity. Recent work on MNCs investigates “horizontal” or “vertical” motivations⁴ for FDI and explores why alternative productive arrangements, such as whole ownership of foreign affiliates, joint ventures, exports, or arm’s length contracts, are employed.⁵ Such analyses of MNC activity typically do not consider associated capital flows.⁶ Research on capital flows typically abstracts

3. It should be emphasized that our model abstracts from any portfolio decision by investors and instead focuses on the financing decisions of firms. Bertaut, Griever, and Tryon (2006) analyze U.S. ownership of foreign securities and conclude that nonfinancial institutions are a fairly small fraction (less than 10%) of overall foreign portfolio investment, and this is when including all securities such as fixed-income investments. As such, our model (unlike the one in Albuquerque [2003]) may not be particularly well suited to interpret cross-country patterns in the composition of capital flows.
4. The horizontal FDI view represents FDI as the replication of capacity in multiple locations in response to factors such as trade costs, as in Markusen (1984), Brainard (1997), Markusen and Venables (2000), and Helpman, Melitz, and Yeaple (2004). The vertical FDI view represents FDI as the geographic distribution of production globally in response to the opportunities afforded by different markets, as in Helpman (1984) and Yeaple (2003). Caves (1996) and Markusen (2002) provide particularly useful overviews of this literature.

5. Ethier and Markusen (1996), Antràs (2003, 2005), Antràs and Helpman (2004), Desai, Foley, and Hines (2004a), Grossman and Helpman (2004), and Feenstra and Hanson (2005) analyze the determinants of alternative foreign production arrangements.

6. Several studies linking levels of MNC activity and FDI flows are worth noting. First, high-frequency changes in FDI capital flows have been linked to relative wealth levels through real exchange rate movements (as in Froot and Stein [1991] and Blonigen [1997]), broader measures of stock market wealth (as in Klein and Rosengren [1994] and Baker, Foley, and Wurgler [2009]) and to credit market conditions (as in Klein, Peek, and Rosengren [2002]). Second, MNCs have also been shown to opportunistically employ internal capital markets in weak institutional environments (as in Desai, Foley, and Hines [2004b]) and during currency crises (as in Aguiar and Gopinath [2005] and Desai, Foley, and Forbes [2008]). These
from firm activity and has focused on the paradox posed by Lucas (1990) of limited capital flows from rich to poor countries in the face of large presumed rate-of-return differentials. Whereas Lucas (1990) emphasizes human-capital externalities to help explain this paradox, Reinhart and Rogoff (2004) review subsequent research on aggregate capital flows and conclude that credit market conditions and political risk play significant roles. By examining firm behavior in a setting with variation in investor protections, this paper attempts to unify an investigation of MNC activity and FDI flows.

The rest of the paper is organized as follows. Section II lays out the model and generates several predictions related to the model. Section III provides details on the data employed in the analysis. Section IV presents the results of the empirical analysis, and Section V concludes.

II. THEORETICAL FRAMEWORK

In this section, we develop a model of financing that builds on and extends the work of Holmstrom and Tirole (1997).⁷ We illustrate how the model generates both multinational activity as well as FDI flows. Finally, we explore some firm-level empirical predictions that emerge from the model and that we take to the data in later sections.

II.A. A Model of Financial Contracting

Environment. We consider the problem of an agent, an inventor, who is endowed with an amount W of financial wealth and the technology or knowledge to produce a differentiated good. Consumers in two countries, Home and Foreign, derive utility from consuming this differentiated good. (Appendix I develops a multicountry version of the model.) The good, however, is

papers emphasize how heterogeneity in access to capital can interact with MNC production decisions. Marin and Schnitzer (2004) also study the financing decisions of MNCs in a model that stresses managerial incentives.
Their model, however, takes the existence of MNCs as given and considers an incomplete-contracting setup, in contrast to our complete-contracting setup. The predictions from their model are quite distinct from the ones we develop here and show to be supported by U.S. data.

7. Our model generalizes the setup in Holmstrom and Tirole (1997) by allowing for diminishing returns to investment and for variable monitoring levels. The scope of the two papers is also very distinct: Holmstrom and Tirole (1997) study the monitoring role of banks in a closed-economy model, whereas our focus is on MNCs.
prohibitively costly to trade, and thus servicing a particular market requires setting up a production facility in that country. The inventor is located at Home and cannot fully control production in Foreign. Servicing that market thus requires contracting with a foreign agent, an entrepreneur, to manage production there. We assume that entrepreneurs are endowed with no financial wealth and their outside option is normalized to 0. There also exists a continuum of infinitesimal external investors in Foreign that have access to a technology that gives them a gross rate of return equal to 1 on their wealth. All parties are risk neutral and are protected by limited liability. There are three periods: a date-0 contracting stage, a date-1 investment stage, and a date-2 production/consumption stage.

Consumer Preferences and Technology. In the main text, we focus on describing production and financing decisions in the Foreign market. For that purpose, we assume that preferences and technology at Home are such that at date 2 the inventor obtains a constant gross return β > 1 for each unit of wealth he invests in production at Home at date 1. We refer to this gross return as the inventor’s shadow value of cash. Our assumption β > 1 implies that the opportunity cost of funds is lower for external investors than for the inventor. In Appendix I, this higher-than-1 value of β is endogenously derived in a multicountry version of the model where consumer preferences, technology, and financial contracting in all countries are fully specified. Note that the provision that β > 1 does not imply that the effective cost of capital provided by external investors is always lower than the effective cost of capital provided by the inventor because informational frictions may drive a wedge between returns earned and the costs borne by the relevant parties.
We assume that Foreign preferences are such that cash flows or profits obtained from the sale of the differentiated good in Foreign can be expressed as a strictly increasing and concave function of the quantity produced; that is, $R(q)$, with $R'(q) > 0$ and $R''(q) \le 0$. We also assume the standard conditions $R(0) = 0$, $\lim_{q \to 0} R'(q) = +\infty$, and $\lim_{q \to \infty} R'(q) = 0$. These properties of $R(q)$ can be derived from preferences featuring a constant (and higher-than-1) elasticity of substitution across a continuum of differentiated goods produced by different firms. In such a case, the elasticity of $R(q)$ with respect to $q$ is constant and given by a parameter $\alpha \in (0, 1)$.
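As a quick illustration of these conditions, the sketch below uses the constant-elasticity form R(q) = A q^α, which the text mentions as the case implied by constant-elasticity-of-substitution preferences. The numerical values of `A` and `ALPHA` are assumptions chosen only for the example; they are not taken from the paper.

```python
# Illustrative constant-elasticity revenue function R(q) = A * q**ALPHA.
# A and ALPHA are assumed values for this sketch, not estimates from the paper.
A = 2.0
ALPHA = 0.5  # must lie in (0, 1)


def revenue(q: float) -> float:
    """Revenue R(q)."""
    return A * q ** ALPHA


def marginal_revenue(q: float) -> float:
    """First derivative R'(q), valid for q > 0."""
    return ALPHA * A * q ** (ALPHA - 1.0)


def revenue_curvature(q: float) -> float:
    """Second derivative R''(q), valid for q > 0."""
    return ALPHA * (ALPHA - 1.0) * A * q ** (ALPHA - 2.0)


def revenue_elasticity(q: float) -> float:
    """Elasticity of revenue with respect to output, q R'(q) / R(q)."""
    return q * marginal_revenue(q) / revenue(q)


if __name__ == "__main__":
    for q in (0.1, 1.0, 10.0):
        print(q, marginal_revenue(q), revenue_curvature(q), revenue_elasticity(q))
```

Under this functional form the marginal revenue is positive and falling in q, and the elasticity equals `ALPHA` at every output level, which is the property the model later uses when α(x) is treated as constant.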
Foreign production is managed by the foreign entrepreneur, who at date 1 can privately choose to behave and enjoy no private benefits, or misbehave and take private benefits. When the manager behaves, the project performs with probability pH , in the sense that when an amount x is invested at date 1, project cash flows at date 2 are equal to R(x) with probability pH and 0 otherwise.8 When the manager misbehaves, the project performs with a lower probability pL < pH and expected cash flows are pL R(x). We assume that the private benefit a manager obtains from misbehaving is increasing in the size of the project, and for simplicity, we specify it as being proportional to the return of the project, that is, BR(x). In Section II.C, we discuss how similar results obtain if private benefits are proportional to the level of investment x. Managerial misbehavior and the associated private benefits can be manifested by choosing to implement the project in a way that generates perquisites for the manager or his associates, in a way that requires less effort, or in a way that is more fun or glamorous. As described below, we relate the ability to engage in such private benefits to the level of investor protections in Foreign as well as to the extent to which the entrepreneur is monitored. The idea is that countries with better investor protections tend to enforce laws that limit the ability of managers to divert funds from the firm or to enjoy private benefits or perquisites. This interpretation parallels the logic in Tirole (2005, p. 359). When investor protections are not perfectly secure, monitoring by third agents is helpful in reducing the extent to which managers are able to divert funds or enjoy private benefits. Following Holmstrom and Tirole (1997), we introduce a monitoring technology that reduces the private benefit of the foreign entrepreneur when he misbehaves. 
It is reasonable to assume that the inventor can play a particularly useful role in monitoring the behavior of the foreign entrepreneur because the inventor is particularly well informed about how to manage the production of output using its technology. Intuitively, the developer of a technology is in a privileged position to determine if project failure is associated with managerial actions or bad luck.⁹ We capture this in a

8. This assumes that, when the project succeeds, each unit invested results in a unit of output ($q = x$), whereas when the project fails, output is zero ($q = 0$). We relax the latter assumption in Section II.C.

9. An alternative way to interpret monitoring is as follows. Suppose that the foreign entrepreneur can produce the good under a variety (a continuum, actually) of potential techniques indexed by $z \in [0, B]$. Technique 0 entails a probability of success equal to $p_H$ and a zero private benefit. All techniques with $z > 0$ are
stark way by assuming that no other agent in the economy can productively monitor the foreign entrepreneur, though we discuss a more general setup in Section II.C. We assume that monitoring costs are proportional to the return of the project and that, when the inventor incurs an effort cost $C R(x)$ in monitoring at date 1, the private benefit for the local entrepreneur is multiplied by a factor $\delta(C)$, with $\delta'(C) < 0$, $\delta''(C) > 0$, $\delta(0) = \bar{\delta}$, $\lim_{C \to \infty} \delta(C) = 0$, $\lim_{C \to 0} \delta'(C) = -\infty$, and $\lim_{C \to \infty} \delta'(C) = 0$.¹⁰ This assumption reflects the idea that larger projects require effort to monitor. Section II.C considers the possibility that effort costs are proportional to investment levels and similar results follow. As mentioned earlier, the scope of private benefits is related to the level of investor protection of the host country by an index $\gamma \in (0, 1)$. In particular, we specify that

$$B(C; \gamma) = (1 - \gamma)\,\delta(C). \tag{1}$$
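To make the specification in (1) concrete, the following sketch picks one monitoring function that satisfies the stated conditions, δ(C) = δ̄ exp(−√C), and checks the signs of the partial derivatives of B numerically. The functional form and the value of `DELTA_BAR` are assumptions made for illustration; the paper only imposes the qualitative conditions.

```python
import math

# Assumed upper bound on the private-benefit factor (delta-bar); illustrative only.
DELTA_BAR = 0.3


def delta(c: float) -> float:
    """Assumed monitoring function: decreasing, convex, delta(0) = DELTA_BAR."""
    return DELTA_BAR * math.exp(-math.sqrt(c))


def delta_prime(c: float) -> float:
    """Analytic derivative of the assumed delta; tends to -infinity as c -> 0."""
    s = math.sqrt(c)
    return -DELTA_BAR * math.exp(-s) / (2.0 * s)


def private_benefit(c: float, gamma: float) -> float:
    """Equation (1): B(C; gamma) = (1 - gamma) * delta(C)."""
    return (1.0 - gamma) * delta(c)


def cross_partial(c: float, gamma: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of d^2 B / (dC dgamma)."""
    return (
        private_benefit(c + h, gamma + h)
        - private_benefit(c + h, gamma - h)
        - private_benefit(c - h, gamma + h)
        + private_benefit(c - h, gamma - h)
    ) / (4.0 * h * h)


if __name__ == "__main__":
    c, gamma = 0.05, 0.5
    print("dB/dgamma     =", -delta(c))
    print("dB/dC         =", (1.0 - gamma) * delta_prime(c))
    print("cross partial =", cross_partial(c, gamma), "vs", -delta_prime(c))
```

The positive cross-partial is the sense in which monitoring matters more where investor protection is weak: a marginal unit of monitoring reduces B by more when γ is low.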
Note that this formulation implies that $\partial B(\cdot)/\partial \gamma < 0$, $\partial B(\cdot)/\partial C < 0$, and $\partial^2 B(\cdot)/\partial C \partial \gamma = -\delta'(C) > 0$. This formulation captures the intuition that the scope for private benefits is decreasing in both investor protection and monitoring and that monitoring has a relatively larger effect on private benefits in countries with poor legal protection of investors. It also implies that parent monitoring substitutes for investor protection. The idea behind this assumption is that both parent monitoring and investor protections constrain managers and that parent monitoring is effective even in imperfect legal environments. This would be the case if, for example, parent monitoring during the production process prevented misbehavior from occurring, thus eliminating any need for legal action after improper behavior occurs.

Contracting. We consider optimal contracting between three sets of agents: the inventor, the foreign entrepreneur, and foreign external investors. At date 0, the inventor and the foreign entrepreneur negotiate a contract that stipulates the terms under which the entrepreneur will exploit the technology developed by
associated with a probability of success equal to $p_L$ and a private benefit equal to $z$. Clearly, all techniques with $z \in (0, B)$ are dominated from the point of view of the foreign entrepreneur, who will thus effectively (privately) choose either $z = 0$ or $z = B$, as assumed in the main text. Under this interpretation, we can think of monitoring as reducing the upper bound of $[0, B]$.

10. These conditions are sufficient to ensure that the optimal contract is unique and satisfies the second-order conditions.
the inventor. This contract includes a (possibly negative) date-0 transfer F from the inventor to the entrepreneur, as well as the agents’ date-2 payoffs contingent on the return of the project.11 When F > 0, the date-0 payment represents the extent to which the inventor cofinances the project in the Foreign country. When F < 0, this payment can be thought of as the price or up-front royalties paid for the use of the technology, which the inventor can invest in the Home market at date 1. The contract between the inventor and the entrepreneur also stipulates the date-1 scale of investment x, while the managerial and monitoring efforts of the entrepreneur and inventor, respectively, are unverifiable and thus cannot be part of the contract. Also at date 0, the foreign entrepreneur and external investors sign a financial contract under which the entrepreneur borrows an amount E from the external investors at date 0 in return for a date-2 payment contingent on the return of the project. We consider an optimal contract from the point of view of the inventor and allow the contract between the inventor and the entrepreneur to stipulate the terms of the financial contract between the entrepreneur and foreign external investors. We rule out “direct” financial contracts between the inventor and foreign external investors. This is justified in the extension of the model developed in Appendix I, where the inventor’s shadow value of cash β is endogenized. Given the payoff structure of our setup and our assumptions of risk neutrality and limited liability, it is straightforward to show that an optimal contract is such that all date-2 payoffs can be expressed as shares of the return generated by the project. All agents obtain a payoff equal to zero when the project fails (i.e., when the return is zero) and a positive payoff when the project succeeds (in which case cash flows are positive). 
When an agent’s share of the date-2 return is positive, this agent thus becomes an equity holder in the entrepreneur’s production facility.¹² We define $\phi_I$ and $\phi_E$ as the equity shares held by the inventor and external investors, respectively, with the remaining share $1 - \phi_I - \phi_E$

11. For simplicity, we assume that the inventor’s date-2 return in its Home market (which is not modeled in the main text) is not pledgeable in Foreign.

12. We focus on an interpretation of payoffs resembling the payoffs of an equity contract, but the model is not rich enough to distinguish our optimal contract from a standard debt contract. Our results would survive in a model in which agents randomized between using equity and debt contracts. In any case, we bear this in mind in the empirical section of the paper, where we test the predictions of the model.
accruing to the foreign entrepreneur. Notice that when $\phi_I$ is large enough, the entrepreneur’s production facility becomes a subsidiary of the inventor’s firm.

II.B. Optimal Contract and Empirical Predictions

We next characterize an optimal contract that induces the entrepreneur to behave and the inventor to monitor. This optimal contract is given by the tuple $\{\tilde{F}, \tilde{\phi}_I, \tilde{x}, \tilde{\phi}_E, \tilde{E}, \tilde{C}\}$ that solves the following program:

$$
\begin{aligned}
\max_{F,\,\phi_I,\,x,\,\phi_E,\,E,\,C} \quad & \Pi_I = \phi_I\, p_H R(x) + (W - F)\,\beta - C R(x) && \text{(P1)} \\
\text{s.t.} \quad & x \le E + F && \text{(i)} \\
& p_H\, \phi_E\, R(x) \ge E && \text{(ii)} \\
& p_H\, (1 - \phi_E - \phi_I)\, R(x) \ge 0 && \text{(iii)} \\
& (p_H - p_L)\,(1 - \phi_E - \phi_I)\, R(x) \ge (1 - \gamma)\,\delta(C)\, R(x) && \text{(iv)} \\
& (p_H - p_L)\, \phi_I\, R(x) \ge C R(x). && \text{(v)}
\end{aligned}
$$
The objective function represents the payoff of the inventor. The first term represents the inventor’s dividends from the expected cash flows of the foreign production facility. The second term represents the gross return from investing his wealth W minus the date-0 transfer F in the Home market.¹³ The last term represents the monitoring costs. The first constraint is a financing constraint. Because the local entrepreneur has no wealth, his ability to invest at date 1 is limited by the sum of the external investors’ financing E and the cofinancing F by the inventor. The second inequality is the participation constraint of external investors, who need to earn at least an expected gross return on their investments equal to 1. Similarly, the third inequality is the participation constraint of the foreign entrepreneur, given his zero outside option. The fourth inequality is the foreign entrepreneur’s incentive compatibility constraint. This presumes that it is in the interest of the inventor to design a contract in a way that induces the foreign entrepreneur to behave. In Appendix II, we show that this will necessarily be the case, provided that γ is sufficiently large. The final constraint is the inventor’s incentive compatibility constraint: if this condition were not satisfied, the inventor’s payoff would be

13. We assume throughout that W is large enough to ensure that W − F ≥ 0 in equilibrium.
lower when exerting the monitoring level $\tilde{C}$ than when not doing so.¹⁴

In the program above, constraint (iii) will never bind. Intuitively, as is standard in incomplete information problems, the incentive compatibility constraint of the entrepreneur demands that this agent obtains some informational rents in equilibrium, and thus his participation constraint is slack. Conversely, the other four constraints will bind in equilibrium. This is intuitive for the financing constraint (i) and the participation constraint of investors (ii). It is also natural that the optimal contract from the point of view of the inventor will seek to minimize the (incentive-compatible) equity share accruing to the foreign entrepreneur, which explains why constraint (iv) binds. It is perhaps less intuitive that constraint (v) also binds, indicating that the optimal contract minimizes the equity share $\phi_I$ allocated to the inventor. In particular, it may appear that a large $\phi_I$ would be attractive because it may foster a larger level of cofinancing F at date 0, thereby encouraging investment. However, inspection of constraint (iv) reveals that a larger $\phi_I$ decreases the ability of the entrepreneur to borrow from external investors, as it reduces his pledgeable income. Overall, one can show that, for a given level of monitoring, whether utility is transferred through an equity share or a date-0 lump-sum payment has no effect on the scale of the project. In addition, it is clear from the objective function that the inventor strictly prefers a date-0 lump-sum transfer because he can use these funds to invest domestically and obtain a gross rate of return β > 1 on them. Hence, the minimal incentive-compatible inventor equity share $\phi_I$ is optimal.

With these results at hand, it is immediate from constraint (v) that the optimal equity stake held by the inventor will be given by

$$\tilde{\phi}_I = \frac{\tilde{C}}{p_H - p_L}, \tag{2}$$

which will be positive as long as $\tilde{C}$ is positive. In addition, manipulation of the first-order conditions of program (P1) delivers the following expression that implicitly determines the level of monitoring (see Appendix II for details):

$$-\delta'(\tilde{C}) = \frac{\beta p_H - p_L}{(1 - \gamma)\,\beta p_H}. \tag{3}$$

14. Our derivation of this IC constraint assumes that if the inventor deviates from $\tilde{C}$, it does so by setting $C = 0$ (which for large enough $\bar{\delta}$ would lead to a violation of the entrepreneur’s incentive compatibility constraint). This is without loss of generality because any other deviation $C > 0$ is dominated.
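Because (3) pins down monitoring only implicitly, a numerical sketch may help. The code below solves (3) by bisection under an assumed monitoring function δ(C) = δ̄ exp(−√C), which satisfies the conditions the paper imposes on δ. The functional form, the default values of `p_h` and `p_l`, and `DELTA_BAR` are all illustrative assumptions, not values from the paper.

```python
import math

DELTA_BAR = 0.3  # assumed delta-bar, illustrative only


def neg_delta_prime(c: float) -> float:
    """-delta'(C) for the assumed delta(C) = DELTA_BAR * exp(-sqrt(C))."""
    s = math.sqrt(c)
    return DELTA_BAR * math.exp(-s) / (2.0 * s)


def monitoring_rhs(gamma: float, beta: float, p_h: float, p_l: float) -> float:
    """Right-hand side of equation (3)."""
    return (beta * p_h - p_l) / ((1.0 - gamma) * beta * p_h)


def optimal_monitoring(gamma: float, beta: float,
                       p_h: float = 0.9, p_l: float = 0.5) -> float:
    """Solve -delta'(C) = RHS of (3) for C by bisection on a log scale.

    -delta'(C) is strictly decreasing in C (delta is convex), so the root is unique.
    """
    target = monitoring_rhs(gamma, beta, p_h, p_l)
    lo, hi = 1e-16, 1e6
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # geometric midpoint
        if neg_delta_prime(mid) > target:
            lo = mid  # marginal effect of monitoring still above target: raise C
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    for gamma in (0.3, 0.5, 0.7):
        print("gamma =", gamma, "C =", optimal_monitoring(gamma, beta=1.1))
```

Under these assumptions the solved monitoring level falls as either γ or β rises, since both raise the right-hand side of (3) while the left-hand side is decreasing in C.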
Straightforward differentiation of (3) together with the convexity of the function $\delta(\cdot)$ produces the following result:

LEMMA 1. The amount of monitoring $\tilde{C}$ is decreasing in both investor protection γ in Foreign and in the inventor’s shadow value of cash β.

The effect of investor protection on monitoring is intuitive. Given our specification of the private benefit function $B(\cdot)$ in (1), the marginal benefit from monitoring is larger, the less developed is the financial system in Foreign (the lower is γ). Because the marginal cost of monitoring is independent of γ, $\tilde{C}$ and γ are negatively correlated in the optimal contract. The effect of the shadow value of cash β on monitoring is a bit subtler. The intuition behind the result lies in the fact that the larger that β is, the larger is the opportunity cost of remunerating the inventor through ex-post dividends rather than through an ex-ante lump-sum transfer. Because of the tight mapping between $\tilde{\phi}_I$ and $\tilde{C}$ imposed by the incentive compatibility constraint in (v), we have that a larger β is also associated with a higher shadow cost of monitoring and hence with a lower optimal amount of monitoring.

In light of (2), it is clear that our theory has implications for the share of equity held by the inventor that relate closely to the implications for monitoring. In particular, $\tilde{\phi}_I$ is proportional to the level of monitoring $\tilde{C}$ and thus is affected by the parameters γ and β in the same way as is monitoring. This reflects that equity shares emerge in our model as incentives for the inventor to monitor the foreign entrepreneur. As a result, we can establish the following proposition.

PROPOSITION 1. The share of equity held by the inventor is decreasing both in investor protection γ in Foreign and in the inventor’s shadow value of cash β.

An immediate corollary of this result follows.

COROLLARY 1. Suppose that a transaction is recorded as an FDI transaction only if $\tilde{\phi}_I \ge \underline{\phi}_I$. Then, there exists a threshold investor protection $\gamma^* \in [0, 1]$ such that the optimal contract entails FDI only if $\gamma < \gamma^*$.

Our theory thus predicts that the prevalence of FDI in a given country should, other things equal, be a decreasing function of the level of investor protection in that country. Manipulation of the first-order conditions of program (P1) also delivers an implicit function of the level of investment as a function of parameters and the optimal level of monitoring $\tilde{C}$:

$$R'(\tilde{x}) = \frac{1}{p_H \left( 1 - \dfrac{(1 - \gamma)\,\delta(\tilde{C})}{p_H - p_L} - \dfrac{\beta p_H - p_L}{\beta p_H}\, \dfrac{\tilde{C}}{p_H - p_L} \right)}. \tag{4}$$
Equation (4) implicitly defines the level of expected sales by the firm, that is, $p_H R(\tilde{x})$. Differentiating this equation with respect to γ and β, we obtain the following proposition (see Appendix III for details).

PROPOSITION 2. Output and cash flows in Foreign are increasing in investor protection γ in Foreign and decreasing in the inventor’s shadow value of cash β.

The intuition for the effect of investor protection is straightforward. Despite the fact that the inventor’s monitoring reduces financial frictions, both the foreign entrepreneur’s compensation, as dictated by his incentive compatibility constraint (iv), and monitoring costs are increasing in the scale of operation. In countries with weaker investor protections, the perceived marginal cost of investment is higher, thus reducing equilibrium levels of investment. Using constraints (i), (ii), and (iv), one can also obtain the terms of the optimal financial contract with external investors in terms of $\tilde{C}$ and $\tilde{x}$:

$$\tilde{\phi}_E = 1 - \frac{(1 - \gamma)\,\delta(\tilde{C})}{p_H - p_L} - \frac{\tilde{C}}{p_H - p_L}, \qquad \tilde{E} = p_H\, \tilde{\phi}_E\, R(\tilde{x}).$$

Finally, straightforward manipulation delivers an optimal lump-sum date-0 transfer equal to

$$\tilde{F} = \frac{p_L}{\beta\,(p_H - p_L)}\, \tilde{C}\, R(\tilde{x}) - \left( \frac{R(\tilde{x})}{R'(\tilde{x})\, \tilde{x}} - 1 \right) \tilde{x}.$$
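The closed-form pieces above can be assembled into a small numerical sketch of the whole contract. The code assumes a constant-elasticity revenue function R(x) = A x^α and the monitoring function δ(C) = δ̄ exp(−√C); these forms and every parameter value (`P_H`, `P_L`, `DELTA_BAR`, `A`, `ALPHA`) are illustrative assumptions, not calibrations from the paper. It then checks that the transfer formula agrees with the binding financing constraint x = E + F.

```python
import math

# All values below are assumed for illustration only.
P_H, P_L = 0.9, 0.5      # success probabilities when behaving / misbehaving
DELTA_BAR = 0.3          # delta(0)
A, ALPHA = 1.0, 0.5      # revenue function R(x) = A * x**ALPHA


def delta(c: float) -> float:
    return DELTA_BAR * math.exp(-math.sqrt(c))


def optimal_monitoring(gamma: float, beta: float) -> float:
    """Solve equation (3) for C by bisection on a log scale."""
    target = (beta * P_H - P_L) / ((1.0 - gamma) * beta * P_H)
    lo, hi = 1e-16, 1e6
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        s = math.sqrt(mid)
        if DELTA_BAR * math.exp(-s) / (2.0 * s) > target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def optimal_contract(gamma: float, beta: float) -> dict:
    """Evaluate equations (2), (4) and the expressions for phi_E, E and F."""
    spread = P_H - P_L
    c = optimal_monitoring(gamma, beta)
    phi_i = c / spread                                        # equation (2)
    phi_e = 1.0 - (1.0 - gamma) * delta(c) / spread - phi_i
    wedge = (1.0 - (1.0 - gamma) * delta(c) / spread
             - (beta * P_H - P_L) / (beta * P_H) * c / spread)
    # Equation (4): R'(x) = 1 / (P_H * wedge), with R'(x) = ALPHA * A * x**(ALPHA - 1).
    x = (ALPHA * A * P_H * wedge) ** (1.0 / (1.0 - ALPHA))
    r = A * x ** ALPHA
    r_prime = ALPHA * A * x ** (ALPHA - 1.0)
    e = P_H * phi_e * r
    f = P_L / (beta * spread) * c * r - (r / (r_prime * x) - 1.0) * x
    return {"C": c, "phi_I": phi_i, "phi_E": phi_e, "x": x, "E": e, "F": f}


if __name__ == "__main__":
    for gamma in (0.3, 0.5, 0.7):
        print(gamma, optimal_contract(gamma, beta=1.1))
```

With these assumed values the transfer F can come out negative, which corresponds to the up-front royalty case discussed in the contracting section; the sign depends on the parameters chosen.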
Depending on parameter values, the lump-sum transfer can be positive or negative, and it also varies nonmonotonically with the parameters of the model. We can, however, derive sharper predictions for the share of financing that is provided by the inventor. To see this, focus on the case in which the date-0 payment $\tilde{F}$ is positive and can be interpreted as the level of cofinancing by the inventor. The share of investment financed by the inventor is then given by

$$\frac{\tilde{F}}{\tilde{x}} = \frac{p_L}{\beta\,(p_H - p_L)}\, \tilde{C}\, \frac{R(\tilde{x})}{\tilde{x}} - \frac{1 - \alpha(\tilde{x})}{\alpha(\tilde{x})}, \tag{5}$$

where $\alpha(x) \equiv x R'(x)/R(x)$ is the elasticity of revenue to output. As mentioned earlier, when preferences feature a constant elasticity of substitution across a continuum of differentiated goods produced by different firms, $\alpha(x)$ is independent of $x$, and $R(x)$ can be written as $R(x) = A x^{\alpha}$, where $A > 0$ and $\alpha \in (0, 1)$. Notice that the first term in (5) is increasing in $\tilde{C}$ and decreasing in $\tilde{x}$ due to the concavity of $R(x)$. It thus follows from Lemma 1 and Proposition 2 that this first term is necessarily decreasing in γ. As for the second term, it will increase or decrease in $\tilde{x}$ depending on the properties of $\alpha(x)$. In most applications, $\alpha(x)$ will be either independent of $x$ or decreasing in $x$ (e.g., when the firm faces a linear demand function). In those situations, the second term in (5) will also be decreasing in γ and we can conclude the following.

PROPOSITION 3. Provided that $\alpha'(\tilde{x})$ is sufficiently small, the share of inventor financing in total financing ($\tilde{F}/\tilde{x}$) is decreasing in investor protection γ.

The intuition behind the result is that monitoring by inventors has a relatively high marginal product in countries with weak financial institutions. To induce the inventor to monitor, the optimal contract specifies a relatively steeper payment schedule, with a relatively higher contribution by the inventor at date 0 (a higher $\tilde{F}/\tilde{x}$) in anticipation of a higher share of the cash flows generated by the project at date 2 (a higher $\tilde{\phi}_I$).¹⁵

15. The effect of the shadow value of cash on the ratio $\tilde{F}/\tilde{x}$ is ambiguous. A larger β is associated with a lower monitoring level $\tilde{C}$ (Lemma 1), but also with a lower level of $\tilde{x}$ and thus a higher ratio $R(\tilde{x})/\tilde{x}$ (Proposition 2). In addition, β has an additional direct negative effect on the ratio. The overall effect is, in general, ambiguous.
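Proposition 3 can be illustrated numerically for the constant-elasticity case. The sketch below evaluates (5) over a grid of γ, using the fact that with R(x) = A x^α the ratio R(x)/x equals R'(x)/α, so (4) gives R(x̃)/x̃ directly. As before, the monitoring function δ(C) = δ̄ exp(−√C) and all parameter values are assumptions made only for this example.

```python
import math

# Assumed illustrative parameters; none are taken from the paper.
P_H, P_L = 0.9, 0.5
DELTA_BAR = 0.3
ALPHA = 0.95  # constant revenue elasticity, so alpha'(x) = 0


def delta(c: float) -> float:
    return DELTA_BAR * math.exp(-math.sqrt(c))


def optimal_monitoring(gamma: float, beta: float) -> float:
    """Solve equation (3) for C by bisection on a log scale."""
    target = (beta * P_H - P_L) / ((1.0 - gamma) * beta * P_H)
    lo, hi = 1e-16, 1e6
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        s = math.sqrt(mid)
        if DELTA_BAR * math.exp(-s) / (2.0 * s) > target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def inventor_financing_share(gamma: float, beta: float) -> float:
    """Equation (5) under constant elasticity ALPHA."""
    spread = P_H - P_L
    c = optimal_monitoring(gamma, beta)
    wedge = (1.0 - (1.0 - gamma) * delta(c) / spread
             - (beta * P_H - P_L) / (beta * P_H) * c / spread)
    marginal_revenue = 1.0 / (P_H * wedge)          # equation (4)
    revenue_per_unit = marginal_revenue / ALPHA     # R(x)/x = R'(x)/ALPHA
    return (P_L / (beta * spread) * c * revenue_per_unit
            - (1.0 - ALPHA) / ALPHA)


if __name__ == "__main__":
    for gamma in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8):
        print(gamma, inventor_financing_share(gamma, beta=1.1))
```

Because α is constant here, only the first term of (5) moves with γ, and it falls both through lower monitoring and through the larger scale of investment.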
The fact that the monitoring provided by the inventor is unverifiable by third parties is central to our theory of FDI. In particular, if monitoring was verifiable (and thus contractible), the optimal contract analogous to the one described above would immediately imply an equity share for the inventor equal to zero (see Antràs, Desai, and Foley [2007] for details). Hence, the inventor would never choose to deploy her technology abroad through FDI. Instead, the inventor would sell the technology to the foreign entrepreneur in exchange for a positive lump-sum fee (and, hence, the inventor would never cofinance the project). In Section IV, we present empirical tests of Propositions 1, 2, and 3, and Corollary 1. Appendix I shows that Propositions 1, 2, and 3 continue to hold in a multicountry version of the model in which the statements apply to cross-sectional variation in investor protections. Our empirical tests exploit variation in the location of affiliates of U.S. MNCs and analyze the effect of investor protections on proxies for $\tilde{x}$, $\tilde{\phi}_I$, and $\tilde{F}/\tilde{x}$. We identify the inventor in the model as being a parent firm and control for other parameters of the model, such as the shadow value of cash β, the concavity of $R(x)$, the monitoring function $\delta(C)$, and the probabilities $p_H$ and $p_L$ by using fixed effects for each firm in each year and controlling for a wide range of host-country variables. Because our estimation employs parent-firm fixed effects, we are not able to test the predictions regarding the effect of β in Propositions 1, 2, and 3.

II.C. Generalizations

Before proceeding to the empirical analysis, it is useful to consider the robustness of these results to more general environments.
In particular, we consider the degree to which revenue sharing might substitute for equity contracts, the possibility that private benefits and monitoring costs may be proportional to $x$ rather than $R(x)$, and the effects of introducing productive monitoring by external investors. The underlying analysis is provided in Appendix IV.

In the model above, we assume that when the project fails, it delivers a level of revenue equal to zero. Such an assumption greatly enhances tractability but suggests that revenue-sharing contracts may provide benefits similar to equity arrangements. This is problematic because it blurs the mapping between $\phi_I$ in the model and equity shares in the data. More generally, however, revenue-sharing contracts are not optimal when the project delivers a positive level of revenue in case of failure. In fact, a simple
contract in which external investors are issued secured (or risk-free) debt and the inventor and entrepreneur take equity stakes is optimal.16 To see the intuition for this, consider the same setup as in Section II.A, but now assume that, when the project does not perform, it yields a level of revenue equal to $\underline{R}(x) \in (0, R(x))$. As is standard in moral hazard problems with risk-neutral agents, the optimal contract calls for the agents undertaking unobservable actions (e.g., effort decisions) to be maximally punished (subject to the limited-liability constraint) whenever a failure of the project is observed. In our particular setup, this would imply that the optimal contract yields both the inventor and the entrepreneur a payoff of zero whenever a project failure is observed. The entire revenue stream $\underline{R}(x)$ should accrue to external investors. A straightforward way to implement such a contract is for the entrepreneur to issue an amount of secure debt equal to $\underline{R}(x)$ to external investors and for the inventor and entrepreneur to be equity holders. Once the debt is paid, the inventor and entrepreneur receive a share of zero in case of project failure and a share of the amount $R(x) - \underline{R}(x)$ in case of project success. The determination of their optimal shares is analogous to that in Section II.A with $R(x) - \underline{R}(x)$ replacing $R(x)$ (details available upon request).

In this more general setting, it is not possible to implement this optimal allocation of payoffs across agents through simple revenue-sharing arrangements. As such, the model can explain why an instrument with the features of equity tends to dominate both fixed-fee and revenue-sharing contracts in financially underdeveloped countries. Such contracts will likely entail an inefficiently low punishment to the inventor when the project does not perform well.

We next briefly discuss alternative formulations for the entrepreneur’s private benefit of misbehavior and the inventor’s private cost of monitoring.
We have assumed above that these are proportional to the revenue generated by the project in case of success. If instead we specified them as being proportional to $x$, then the optimal equity share $\tilde{\phi}_I$ would be given by

$$\tilde{\phi}_I = \frac{\tilde{C}\,\tilde{x}}{(p_H - p_L)\,R(\tilde{x})}$$
16. A contract in which an entrepreneur issues debt to external investors appears to have empirical validity because most capital provided to affiliates from local sources takes the form of debt.
and would also depend on $\tilde{x}$ and the concavity of the $R(\tilde{x})$ function. One may thus worry that for a sufficiently concave $R(\tilde{x})$ function, it could be the case that equity stakes could be low in low-$\gamma$ countries on account of the low values of $\tilde{x}/R(\tilde{x})$ in those countries. We show in Appendix IV, however, that our main comparative static concerning equity shares holds as long as the elasticity of revenue with respect to output, that is, $\alpha(\tilde{x}) \equiv \tilde{x} R'(\tilde{x})/R(\tilde{x})$, does not increase in $\tilde{x}$ too quickly. The required condition is analogous to that in Proposition 3 and is satisfied for the case of constant price elasticity and linear demand functions.

We finally consider the possibility that local external investors (e.g., banks) also provide useful monitoring, the productivity of which may also be higher in countries with worse investor protections. In Appendix IV, we develop an extension of the model that incorporates monitoring by external investors and that, as with the monitoring by the inventor, is subject to declining marginal benefits. Although the optimal contract is now more complicated, we show that the incentive compatibility constraint for the inventor will continue to bind in equilibrium, implying that the inventor’s equity stake moves proportionally with its level of monitoring. Furthermore, provided that the level of investor protection $\gamma$ is sufficiently high, the analysis remains qualitatively unaltered by the introduction of local monitoring. The reason for this is that, for large values of $\gamma$, the optimal contract already allocates equity stakes $\phi_E$ to external investors that are large enough to induce them to monitor the entrepreneur.17 As a result, although certain details of the optimal contract change with the possibility of local monitoring, the comparative static results derived in Section II.B continue to hold in this more general model, provided that $\gamma$ is sufficiently large.

17. When the level of investor protection is below a certain threshold, then the incentive compatibility for external investors becomes binding, in which case the analysis becomes more complicated. Without imposing particular functional forms on the monitoring functions, it becomes impossible to derive sharp comparative static results (see Appendix IV).

III. DATA AND DESCRIPTIVE STATISTICS

The empirical work presented in Section IV is based on the most comprehensive available data on the activities of U.S. MNCs and on arm’s length technology transfers by U.S. firms. The Bureau of Economic Analysis (BEA) annual survey of U.S. Direct
Investment Abroad from 1982 through 1999 provides a panel of data on the financial and operating characteristics of U.S. firms operating abroad. U.S. direct investment abroad is defined as the direct or indirect ownership or control by a single U.S. legal entity of at least 10% of the voting securities of an incorporated foreign business enterprise or the equivalent interest in an unincorporated foreign business enterprise. A U.S. multinational entity is the combination of a single U.S. legal entity that has made the direct investment, called the U.S. parent, and at least one foreign business enterprise, called the foreign affiliate.18 The most extensive data for the period examined in this study are available for 1982, 1989, 1994, and 1999 when BEA conducted Benchmark Surveys. Accordingly, the analysis is restricted to benchmark years except when the annual frequency of the data is critical—in the analysis of scale in Section IV.C that uses the liberalizations of ownership restrictions. To analyze arm’s length technology transfers, measures of royalty payments, licensing fees, and other payments for intangible assets received by U.S. firms from unaffiliated foreign persons are drawn from the results of BEA’s annual BE-93 survey. This survey requires that all firms receiving payments above certain thresholds report, regardless of whether the firm is a multinational.19 Table I provides descriptive statistics for the variables used in the analysis employing the benchmark year data (Panel A) and analysis employing the full panel (Panel B). Implementing empirical tests requires mapping the variables of the model to reasonable proxies in the data. To investigate the choice of an inventor to engage in arm’s length technology transfer or FDI (Corollary 1), the analysis uses a dummy variable that is equal to 1 if a U.S. 
firm receives an arm’s length royalty payment from a country in a given year and 0 if that firm only serves the country through affiliate activity in a particular year. For Proposition 3’s predictions on the share of inventor financing in total financing ($\tilde{F}/\tilde{x}$), a variable is defined as the ratio of the sum of parent-provided equity and debt to affiliate assets.

18. Coverage and methods of the BEA survey are described in Desai, Foley, and Hines (2002). The survey covers all countries and industries, classifying affiliates into industries that are roughly equivalent to three-digit SIC code industries. As a result of confidentiality assurances and penalties for noncompliance, BEA believes that coverage is close to complete and levels of accuracy are high.

19. Because these data have been collected since 1986, data used in the analysis of arm’s length technology transfers cover only 1989, 1994, and 1999.

Specifically, this share is the ratio of the sum of parent-provided equity and
TABLE I
DESCRIPTIVE STATISTICS

                                                  Mean       Standard deviation
A: Benchmark year data for Tables II–V
Multinational firm variables
  Arm’s length technology transfer dummy          0.2552     0.4360
  Share of affiliate assets financed by parent    0.4146     0.3267
  Share of affiliate equity owned by parent       0.8991     0.2195
  Log of affiliate sales                          9.9024     1.7218
  Log of affiliate employment                     4.7601     1.6060
  Affiliate net PPE/assets                        0.2355     0.2264
  Log of parent R&D                               9.0580     4.3927
Country variables
  Creditor rights                                 2.1415     1.2100
  Private credit                                  0.7536     0.3891
  FDI ownership restrictions                      0.2247     0.4174
  Workforce schooling                             8.1385     2.1739
  Log of GDP                                     26.8002     1.4252
  Log of GDP per capita                           9.3995     1.1019
  Corporate tax rate                              0.3488     0.1060
  Patent protections                              3.2287     0.8480
  Property rights                                 4.3767     0.8378
  Rule of law                                     9.3207     1.4088
  Risk of expropriation                           5.1398     1.2731

B: Annual data for Table V
  Log of affiliate sales                         10.1285     2.1426
  Log of aggregate affiliate sales               15.7572     1.7018
Notes. Panel A provides descriptive statistics for data drawn from the benchmark year survey and used in the analysis presented in Tables II–V. Arm’s length technology transfer dummy is defined for country-year pairs in which a parent has an affiliate or from which a parent receives a royalty payment from an unaffiliated foreign person. This dummy is equal to 1 if the parent receives a royalty payment from an unaffiliated foreign person, and it is otherwise equal to 0. Share of affiliate assets financed by parents is the ratio of parent-provided equity and net parent lending to total affiliate assets. Share of affiliate equity ownership is the equity ownership of the multinational parent. Affiliate net PPE/assets is the ratio of affiliate net property plant and equipment to affiliate assets. Creditor rights is an index of the strength of creditor rights developed in Djankov, McLiesh, and Shleifer (2007); higher levels of the measure indicate stronger legal protections. Private credit is the ratio of private credit lent by deposit money banks to GDP, as provided in Beck, Demirgüç-Kunt, and Levine (1999). FDI ownership restrictions is a dummy equal to 0 if two measures of restrictions on foreign ownership as measured by Shatz (2000) are above 3 on a scale of 1 to 5 and 1 otherwise. Workforce schooling is the average schooling years in the population over age 25 years provided in Barro and Lee (2000). Corporate tax rate is the median effective tax rate paid by affiliates in a particular country and year. Patent protections is an index of the strength of patent rights provided in Ginarte and Park (1997). Property rights is an index of the strength of property rights drawn from the 1996 Index of Economic Freedom. Rule of law is an assessment of the strength and impartiality of a country’s overall legal system drawn from the International Country Risk Guide.
Risk of expropriation is an index of the risk of outright confiscation or forced nationalization of private enterprise, and it is also drawn from the International Country Risk Guide; higher values of this index reflect lower risks. Panel B provides descriptive statistics for annual data covering the 1982–1999 period that are used in the analysis presented in Table V. Log of aggregate affiliate sales is the log of affiliate sales summed across affiliates in a particular country and year.
net borrowing by affiliates from the parent to affiliate assets.20 Proposition 1 considers the determinants of the share of equity held by the inventor, $\phi_I$, and this variable is measured in the data as the share of affiliate equity owned by the multinational parent. The log of affiliate sales is used to test Proposition 2’s predictions on the scale of affiliate activity.

Table I also provides descriptive statistics for a number of other variables. Two measures of investor protections and capital market development are used in the analysis below. Because the model emphasizes the decisions of local lenders, the first measure is creditor rights. This measure is drawn from Djankov, McLiesh, and Shleifer (2007), which extends the sample studied by La Porta et al. (1998) to cover a broader sample of countries over the 1982–1999 period on an annual basis. Creditor rights is an index taking values between 0 and 4 and measures the extent to which the legal system constrains managers from diverting value away from creditors (as a large $\gamma$ does in the model).21 The second measure is the annual ratio of private credit provided by deposit money banks and other financial institutions to GDP, and it is drawn from Beck, Demirgüç-Kunt, and Levine (1999). This measure has the advantage of being an objective, continuous measure of the lending environment that captures the willingness of lenders to provide credit in response to investor protections.22

20. In the model, we have interpreted all sources of financing as equity financing, but as explained in footnote 12, our setup is not rich enough to distinguish equity financing from debt financing. Hence, our empirical tests of Proposition 3 include both.

21. Specifically, the measure is an index formed by adding 1 when (1) the country imposes restrictions, such as creditors’ consent or minimum dividends to file for reorganization; (2) secured creditors are able to gain possession of their security once the reorganization petition has been approved (no automatic stay); (3) secured creditors are ranked first in the distribution of the proceeds that result from the disposition of the assets of a bankrupt firm; and (4) the debtor does not retain the administration of its property pending the resolution of the reorganization.

22. It is possible to employ a measure of shareholder rights to measure investor protections. Creditor rights and private credit are used to measure investor protections for several reasons. First, shareholder rights are only available for a single year near the end of our sample. Second, in our data, there is very little local ownership of affiliate equity, but affiliates do make extensive use of debt borrowed from local sources. As such, using creditor rights and private credit allows us to capitalize on some time-series variation in investor protections and more closely corresponds empirically to the financing choices of affiliates.

Measures of capital market development are correlated with other measures of economic and institutional development that could affect the outcome studied in ways not considered in the
model. Therefore, the regressions control for several measures of economic and institutional variation that might otherwise obscure the analysis. The baseline specifications include controls for FDI ownership restrictions, human capital development, the log of GDP, the log of GDP per capita, and corporate tax rates. A number of countries impose restrictions on the extent to which foreign firms can own local ones. Shatz (2000) documents these restrictions using two distinct measures that capture restrictions on greenfield FDI and cross-border mergers and acquisition activity. The FDI ownership restriction dummy used below is equal to 1 if either of these measures is below 3 and 0 otherwise. Countries with more human capital, with more economic activity, or with a higher level of economic development may be more able to use technology obtained through an arm’s length transfer, and affiliates in these countries may exhibit distinctive financing patterns that reflect the quality of local entrepreneurs as opposed to financial market conditions. To address these possibilities, the specifications include workforce schooling, which measures the average schooling years in the population over 25 years old and is provided in Barro and Lee (2000). Data on the log of GDP and the log of GDP per capita come from the World Development Indicators. Firms could avoid local production or alter their financing patterns in response to tax considerations. Corporate tax rates are imputed from the BEA data by taking the median tax rate paid by affiliates that report positive net income in a particular country and year. Several other controls appear in additional specifications. Firms might choose to deploy technology through affiliate activity as opposed to through an arm’s length transfer, and they might select higher levels of ownership if they fear expropriation by local entrepreneurs (see, for instance, Ethier and Markusen [1986] for a theoretical treatment). 
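The construction of two of the baseline controls described above can be sketched in a few lines. This is an illustration only: the record layout, the field names, and the definition of an affiliate's tax rate as taxes paid divided by net income are assumptions made for the example, not details taken from the BEA data.

```python
from collections import defaultdict
from statistics import median


def impute_tax_rates(affiliates):
    """Median tax rate by (country, year), using only affiliates with positive net income.

    Each affiliate is a dict with hypothetical keys:
    'country', 'year', 'net_income', 'taxes_paid'.
    """
    rates = defaultdict(list)
    for a in affiliates:
        if a["net_income"] > 0:
            # Assumed definition of the rate paid by one affiliate.
            rates[(a["country"], a["year"])].append(a["taxes_paid"] / a["net_income"])
    return {key: median(values) for key, values in rates.items()}


def fdi_restriction_dummy(greenfield_score, acquisition_score):
    """1 if either openness score (1 to 5 scale) is below 3, and 0 otherwise."""
    return 1 if min(greenfield_score, acquisition_score) < 3 else 0


# Hypothetical records for one country-year; the loss-making affiliate is ignored.
sample = [
    {"country": "X", "year": 1994, "net_income": 100.0, "taxes_paid": 30.0},
    {"country": "X", "year": 1994, "net_income": 200.0, "taxes_paid": 80.0},
    {"country": "X", "year": 1994, "net_income": 50.0, "taxes_paid": 10.0},
    {"country": "X", "year": 1994, "net_income": -20.0, "taxes_paid": 5.0},
]
tax_rates = impute_tax_rates(sample)
```

Using the median rather than the mean keeps a few affiliates with unusual tax positions from driving the country-year rate.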
Ginarte and Park (1997) provide a measure of the strength of patent protections, and the Index of Economic Freedom provides a measure of more general property rights. Rule of law is an assessment of the strength and impartiality of a country’s legal system, and it is drawn from the International Country Risk Guide (ICRG). Additionally, firms might fear expropriation by foreign governments and might limit foreign activity and make more extensive use of local financing in response. The ICRG also provides an index of the risk of outright
confiscation or forced nationalization faced by foreign investors. For this measure, higher values indicate lower risks.23

23. Some country-level measures of economic and institutional development are highly correlated. The multicollinearity of these variables might cause the standard errors of our key estimates to be large. However, these coefficient estimates remain unbiased.

Because the BEA data are a panel measuring activity of individual firms in different countries, they allow for the inclusion of effects for each firm in each year, which we refer to as parent-year fixed effects. These fixed effects help control for other parameters of the model that are likely to be specific to particular firms at particular points in time, such as the shadow value of cash $\beta$, the concavity of $R(x)$, the monitoring function $\delta(C)$, and the probabilities $p_H$ and $p_L$. The inclusion of these fixed effects implies that the effects of investor protections are identified from within-firm variation in the characteristics of countries in which the firm is active.

Although the comprehensive data on MNCs and arm’s length technology transfers do offer a number of advantages, it is worth noting one significant oversight. Neither the model nor the empirical work considers situations in which a firm neither invests in nor transfers technology to a particular location.

IV. EMPIRICAL RESULTS

Each of the analyses below is composed of a descriptive figure and firm-level regressions. The figures provide a transparent and intuitive perspective on the predictions, and the regressions allow for a full set of controls and subtler tests emphasizing the role of technology intensity. The predictions on the use of arm’s length technology transfers and the financing and ownership of foreign affiliates are investigated by pooling cross sections from the benchmark years. Investigating the effect on scale requires an alternative setup because controlling for the many unobservable characteristics that might determine firm size is problematic. Fortunately, the model provides a stark prediction with respect to scale that can be tested by analyzing responses to the easing of ownership restrictions.

IV.A. Arm’s Length Technology Transfer Decisions

Figure I provides a preliminary perspective on the extent to which firms deploy technology through arm’s length transfers
FIGURE I
Arm’s Length Technology Transfer versus Direct Investment
Arm’s length technology transfer share is the ratio of the number of firms that receive a royalty from an unaffiliated foreign person to the sum of the number of such firms and those firms that have an affiliate in a particular country and year. These shares are averaged by quintiles of private credit. Private credit varies by year and is the ratio of private credit lent by deposit money banks to GDP, as provided in Beck, Demirgüç-Kunt, and Levine (1999). Higher number quintiles relate to higher values of private credit.
relative to direct ownership, across different quintiles of the private credit measure of investor protections. The propensity to use arm’s length technology transfer is computed at the country-year level as the ratio of the number of firms that receive a royalty payment from an unaffiliated foreign person to the number of firms that either receive such an arm’s length royalty or have an affiliate. The average arm’s length royalty share is 0.11 for the lowest private credit quintile of observations while it is 0.27 for the highest quintile. As predicted by the theory, firms appear to make greater use of arm’s length technology transfers, relative to direct ownership, to access countries with more developed capital markets. Table II further explores arm’s length technology transfers through specifications that include various controls and incorporate subtler tests. The dependent variable in these tests, the arm’s length technology transfer dummy, is defined for country-year pairs in which a firm either has an affiliate or receives a royalty payment from an unaffiliated foreign person. The dummy is equal to 1 if the firm receives a royalty payment from an unaffiliated
TABLE II
ARM’S LENGTH TECHNOLOGY TRANSFER VERSUS DIRECT INVESTMENT
Dependent variable: Arm’s length technology transfer dummy

                                        (1)        (2)        (3)        (4)        (5)        (6)
Creditor rights                         0.0086     0.0131     0.0023
                                        (0.0022)   (0.0026)   (0.0039)
Creditor rights * log of parent R&D                           0.0016
                                                              (0.0005)
Private credit                                                           0.0273     0.0295     −0.0606
                                                                         (0.0129)   (0.0147)   (0.0133)
Private credit * log of parent R&D                                                             0.0117
                                                                                               (0.0020)
FDI ownership restrictions              0.0098     0.0063     0.0079     −0.0001    0.0017     0.0020
                                        (0.0089)   (0.0085)   (0.0094)   (0.0093)   (0.0087)   (0.0097)
Workforce schooling                     0.0069     0.0122     0.0134     0.0066     0.0097     0.0103
                                        (0.0019)   (0.0018)   (0.0020)   (0.0025)   (0.0021)   (0.0024)
Log of GDP                              0.0238     0.0239     0.0268     0.0212     0.0216     0.0243
                                        (0.0032)   (0.0033)   (0.0038)   (0.0031)   (0.0035)   (0.0039)
Log of GDP per capita                   −0.0148    −0.0129    −0.0144    −0.0179    −0.0182    −0.0223
                                        (0.0039)   (0.0058)   (0.0066)   (0.0047)   (0.0067)   (0.0077)
Corporate tax rate                      0.1239     0.1583     0.1777     0.1367     0.1363     0.1588
                                        (0.0435)   (0.0413)   (0.0453)   (0.0434)   (0.0393)   (0.0438)
Patent protections                                 0.0124     0.0127                0.0155     0.0158
                                                   (0.0046)   (0.0052)              (0.0057)   (0.0066)
Property rights                                    −0.0227    −0.0254               −0.0095    −0.0086
                                                   (0.0061)   (0.0067)              (0.0057)   (0.0066)
Rule of law                                        −0.0013    −0.0022               −0.0025    −0.0040
                                                   (0.0039)   (0.0043)              (0.0043)   (0.0046)
Risk of expropriation                              −0.0076    −0.0080               −0.0082    −0.0080
                                                   (0.0044)   (0.0049)              (0.0050)   (0.0055)
Constant                                −0.3780    −0.3569    −0.5162    −0.2839    −0.2638    −0.3923
                                        (0.0840)   (0.0822)   (0.0979)   (0.0807)   (0.0843)   (0.0968)
Parent-year fixed effects?              Y          Y          Y          Y          Y          Y
No. of obs.                             37,314     36,029     30,954     34,583     33,922     29,238
R²                                      .7628      .7645      .6061      .7598      .7624      .6105
Notes. The dependent variable, the arm’s length technology transfer dummy, is defined for country-year pairs in which a parent has an affiliate or from which a parent receives a royalty payment from an unaffiliated foreign person. This dummy is equal to 1 if the parent receives a royalty payment from an unaffiliated foreign person, and it is otherwise equal to 0. Creditor rights is an index of the strength of creditor rights developed in Djankov, McLiesh, and Shleifer (2007); higher levels of the measure indicate stronger legal protections. Private credit is the ratio of private credit lent by deposit money banks to GDP, as provided in Beck, Demirgüç-Kunt, and Levine (1999). FDI ownership restrictions is a dummy equal to 0 if two measures of restrictions on foreign ownership as measured by Shatz (2000) are above 3 on a scale of 1 to 5 and 1 otherwise. Workforce schooling is the average schooling years in the population over age 25 years provided in Barro and Lee (2000). Corporate tax rate is the median effective tax rate paid by affiliates in a particular country and year. Patent protections is an index of the strength of patent rights provided in Ginarte and Park (1997). Property rights is an index of the strength of property rights drawn from the 1996 Index of Economic Freedom. Rule of law is an assessment of the strength and impartiality of a country’s overall legal system drawn from the International Country Risk Guide. Risk of expropriation is an index of the risk of outright confiscation or forced nationalization of private enterprise, and it is also drawn from the International Country Risk Guide; higher values of this index reflect lower risks. Each specification is an OLS specification that includes parent-year fixed effects. Heteroscedasticity-consistent standard errors that correct for clustering at the country-year level appear in parentheses.
foreign person, and it is otherwise equal to 0. The inclusion of parent-year fixed effects controls for a variety of unobservable firm characteristics that might otherwise confound the analysis. All specifications presented in the table also include a measure of the existence of foreign ownership restrictions, workforce schooling, the log of GDP, the log of GDP per capita, and host country tax rates.24 Standard errors are heteroscedasticity-consistent and are clustered at the country-year level. These specifications are linear probability models and are used in order to allow for both parent-year fixed effects and for clustering of standard errors at the country-year level.25

The coefficient on creditor rights in column (1) is positive and significant, affirming the prediction of Corollary 1 that firms are more likely to serve countries with higher levels of investor protections through arm’s length technology transfer as opposed to only through a foreign affiliate. The results also indicate that firms are more likely to engage in arm’s length technology transfer as opposed to just affiliate activity in countries that have a more educated workforce and that have higher corporate tax rates.

The predictions of the model relate to credit market development, but the measure of creditor rights may be correlated with more general variation in the institutional environment. The specification presented in column (2) includes additional proxies for the quality of other host country institutions, including indices of patent protections, property rights, the strength and impartiality of the overall legal system, and the risk of expropriation as control variables. The coefficient on creditor rights remains positive and significant with the inclusion of these additional controls, implying that capital market conditions play an economically significant role relative to other host country institutions.
The effect of a one-standard-deviation change in creditor rights is approximately one and a half times as large as the effect of a one-standard-deviation change in patent protections, which is also positive and significant in explaining the use of arm’s length technology transfer.

24. For the estimated effects of capital market conditions to be unbiased in this and the subsequent tests, these country characteristics must be exogenous to firms’ decisions to use arm’s length technology transfers as opposed to FDI, and firms’ financing and ownership decisions.

25. Given the limited time dimension of our data set, our linear specification avoids the incidental parameter problem inherent in the estimation of a large number of fixed effects. As a robustness check, these specifications have been run as conditional logit specifications. The resulting coefficients on the measures of financial development are of the same sign and statistical significance as those presented in Table II.
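The comparison of one-standard-deviation effects can be reproduced with simple arithmetic. The sketch below takes as inputs the column (2) coefficients on creditor rights and patent protections as read from Table II and the standard deviations reported in Table I; the function and variable names are illustrative.

```python
# Coefficients as read from Table II, column (2).
coef_creditor_rights = 0.0131
coef_patent_protections = 0.0124

# Standard deviations from Table I, Panel A.
sd_creditor_rights = 1.2100
sd_patent_protections = 0.8480


def one_sd_effect(coefficient, std_dev):
    """Change in the probability of arm's length transfer for a one-SD change in a regressor."""
    return coefficient * std_dev


effect_creditor = one_sd_effect(coef_creditor_rights, sd_creditor_rights)
effect_patent = one_sd_effect(coef_patent_protections, sd_patent_protections)
ratio = effect_creditor / effect_patent

print(f"creditor rights:    {effect_creditor:.4f}")
print(f"patent protections: {effect_patent:.4f}")
print(f"ratio:              {ratio:.2f}")
```

Because the specification is a linear probability model, each product is a change in probability: roughly 1.6 percentage points for creditor rights and 1.1 percentage points for patent protections, a ratio of about 1.5.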
The specification presented in column (3) provides a subtler test of the model and the particular mechanism that gives rise to FDI as opposed to arm’s length technology transfer. The model implies that the relative value of monitoring should be more pronounced for firms that conduct more research and development (R&D) because these firms are more likely to be deploying novel technologies that require the unique monitoring ability of parents. Alternatively, firms with limited technological capabilities are less likely to be important to external funders as monitors, and the effects of capital market development on the choice to serve a country through arm’s length technology transfer or affiliate activity should be less pronounced for these kinds of firms. The specification presented in column (3) uses the log of parent R&D as a proxy for the degree to which firms are technologically advanced. Because this specification includes parent-year fixed effects, this variable does not enter on its own, but it is interacted with creditor rights.26 The positive coefficient on the interaction term is consistent with the prediction that the value of creating incentives to monitor through ownership in countries with weak financial development is highest for technologically advanced firms. The specifications presented in columns (4)–(6) of Table II repeat those presented in columns (1)–(3), replacing creditor rights with private credit as a measure of financial development. The positive and significant coefficients on private credit in columns (4) and (5) are consistent with the findings in columns (1) and (2) and illustrate that countries with higher levels of financial development are more likely to be served through arm’s length technology transfers as opposed to just affiliate activity. 
The positive and significant coefficient on private credit interacted with the log of parent R&D presented in column (6) indicates that the effects of capital markets are most pronounced for firms that are R&D intensive.27
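To see what the interaction terms imply in magnitude, the estimated effect of a capital market measure can be evaluated at different levels of parent R&D. The sketch below uses the column (3) and column (6) coefficients as read from Table II and the mean and standard deviation of the log of parent R&D from Table I. The Table I moments may describe a somewhat different sample than the one used in these two columns, so the numbers are illustrative only.

```python
# Main and interaction coefficients as read from Table II.
CREDITOR_RIGHTS = {"main": 0.0023, "interaction": 0.0016}   # column (3)
PRIVATE_CREDIT = {"main": -0.0606, "interaction": 0.0117}   # column (6)

# Log of parent R&D, Table I, Panel A.
MEAN_LOG_RD = 9.0580
SD_LOG_RD = 4.3927


def marginal_effect(coefs, log_rd):
    """Implied effect of a one-unit change in the capital market measure at a given log R&D."""
    return coefs["main"] + coefs["interaction"] * log_rd


for label, coefs in [("creditor rights", CREDITOR_RIGHTS), ("private credit", PRIVATE_CREDIT)]:
    at_mean = marginal_effect(coefs, MEAN_LOG_RD)
    one_sd_above = marginal_effect(coefs, MEAN_LOG_RD + SD_LOG_RD)
    print(f"{label}: {at_mean:.4f} at mean log R&D, {one_sd_above:.4f} one SD above")
```

For both measures the implied effect grows with R&D, which is the pattern the text describes: capital market conditions matter most for R&D-intensive firms.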
26. Because parent characteristics are absorbed by the parent-year fixed effects, the coefficient on this interaction term picks up how the effect of capital market conditions varies across firms. The sample used in this specification and the specification in column (6) includes only MNCs because R&D expenditures are only available in the BEA data for these firms.

27. When running these specifications as conditional logit specifications, the resulting coefficients on these interaction terms are of the same sign and statistical significance as those in Table II, except for the interaction of creditor rights and the log of parent R&D. The coefficient on this variable is positive, but it is not statistically different from zero at conventional levels.
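The point in footnote 26 can be illustrated with the within transformation that fixed effects imply. In the sketch below, all data values and identifiers are hypothetical: a regressor that is constant within a parent-year is wiped out by demeaning, while its interaction with a host-country variable keeps usable variation.

```python
from collections import defaultdict


def demean_within(values, groups):
    """Subtract each group's mean from its members (the fixed-effects within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Hypothetical data: two parent-years, each observed in three host countries.
parent_year = ["A-1994", "A-1994", "A-1994", "B-1994", "B-1994", "B-1994"]
log_rd = [8.0, 8.0, 8.0, 11.0, 11.0, 11.0]          # constant within a parent-year
creditor_rights = [1.0, 2.0, 4.0, 1.0, 2.0, 4.0]    # varies across host countries
interaction = [r * c for r, c in zip(log_rd, creditor_rights)]

rd_within = demean_within(log_rd, parent_year)
interaction_within = demean_within(interaction, parent_year)

print(rd_within)           # all zeros: absorbed by the fixed effects
print(interaction_within)  # nonzero: still identified
```

This is why the log of parent R&D cannot enter a parent-year fixed-effects specification on its own but can enter through its interaction with creditor rights or private credit.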
FIGURE II
Parent Financing of Affiliate Assets
Parent financing share is the ratio of the sum of net borrowing from the parent and parent equity provisions to affiliate assets. The bars display medians of the country-level average shares for each quintile of private credit. Private credit is the ratio of private credit lent by deposit money banks to GDP, as provided in Beck, Demirgüç-Kunt, and Levine (1999). Higher number quintiles relate to higher values of private credit.
IV.B. Financing and Ownership of Foreign Affiliates

The analysis presented in Figure II and Table III investigates whether financing and ownership decisions reflect the mechanics emphasized in the model. As depicted in Figure II, parent firms provide financing that averages 45% of affiliate assets in countries in the lowest quintile of private credit but 38% of affiliate assets in countries from the highest quintile of private credit.28

28. More specifically, the values displayed in this chart are computed by first taking average shares of affiliate assets financed by parents for each country in each year. These shares are defined as the ratio of the sum of net borrowing from the parent and parent equity provisions (including both paid-in-capital and retained earnings) to affiliate assets. The bars display medians of the country-level averages for each quintile of private credit.

Table III presents the results of more detailed tests of this relation. In addition to a variety of country-level controls, fixed effects for each parent in each year control for differences across firms. The negative and significant coefficient on creditor rights in column (1)
TABLE III
PARENT FINANCING DECISIONS
Dependent variable: Share of affiliate assets financed by parent

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Creditor rights | −0.0166 (0.0054) | −0.0164 (0.0051) | −0.0080 (0.0055) | | | |
| Creditor rights * log of parent R&D | | | −0.0010 (0.0003) | | | |
| Private credit | | | | −0.0632 (0.0195) | −0.0384 (0.0215) | −0.0084 (0.0220) |
| Private credit * log of parent R&D | | | | | | −0.0031 (0.0012) |
| FDI ownership restrictions | −0.0406 (0.0146) | −0.0426 (0.0154) | −0.0426 (0.0155) | −0.0323 (0.0171) | −0.0358 (0.0160) | −0.0358 (0.0162) |
| Workforce schooling | 0.0200 (0.0057) | 0.0110 (0.0042) | 0.0114 (0.0043) | 0.0199 (0.0060) | 0.0151 (0.0048) | 0.0157 (0.0049) |
| Log of GDP | −0.0224 (0.0055) | −0.0180 (0.0070) | −0.0180 (0.0071) | −0.0157 (0.0062) | −0.0148 (0.0084) | −0.0148 (0.0085) |
| Log of GDP per capita | −0.0327 (0.0112) | −0.0066 (0.0153) | −0.0072 (0.0154) | −0.0285 (0.0136) | 0.0027 (0.0167) | 0.0030 (0.0169) |
| Corporate tax rate | −0.1288 (0.0776) | −0.2135 (0.0763) | −0.2061 (0.0764) | −0.1135 (0.0743) | −0.1803 (0.0742) | −0.1731 (0.0745) |
| Patent protections | | −0.0392 (0.0113) | −0.0388 (0.0114) | | −0.0434 (0.0119) | −0.0436 (0.0120) |
| Property rights | | 0.0112 (0.0112) | 0.0110 (0.0115) | | −0.0096 (0.0103) | −0.0113 (0.0106) |
| Rule of law | | 0.0059 (0.0078) | 0.0062 (0.0079) | | 0.0065 (0.0080) | 0.0068 (0.0080) |
| Risk of expropriation | | 0.0010 (0.0090) | 0.0007 (0.0091) | | 0.0009 (0.0092) | 0.0003 (0.0094) |
| Constant | 1.2571 (0.1083) | 0.9710 (0.1420) | 0.9728 (0.1435) | 1.0444 (0.1479) | 0.8389 (0.1868) | 0.8330 (0.1906) |
| Parent-year fixed effects? | Y | Y | Y | Y | Y | Y |
| Affiliate controls? | N | Y | Y | N | Y | Y |
| No. of obs. | 51,060 | 41,232 | 40,297 | 48,183 | 38,911 | 38,016 |
| R2 | .3013 | .3105 | .3071 | .3076 | .3167 | .3134 |
Notes. The dependent variable is the ratio of parent-provided equity and net parent lending to total assets. Creditor rights is an index of the strength of creditor rights developed in Djankov, McLiesh, and Shleifer (2007); higher levels of the measure indicate stronger legal protections. Private credit is the ratio of private credit lent by deposit money banks to GDP, as provided in Beck, Demirgüç-Kunt, and Levine (1999). FDI ownership restrictions is a dummy equal to 0 if two measures of restrictions on foreign ownership as measured by Shatz (2000) are above 3 on a scale of 1 to 5 and 1 otherwise. Workforce schooling is the average schooling years in the population over age 25 years provided in Barro and Lee (2000). Corporate tax rate is the median effective tax rate paid by affiliates in a particular country and year. Patent protections is an index of the strength of patent rights provided in Ginarte and Park (1997). Property rights is an index of the strength of property rights drawn from the 1996 Index of Economic Freedom. Rule of law is an assessment of the strength and impartiality of a country’s overall legal system drawn from the International Country Risk Guide. Risk of expropriation is an index of the risk of outright confiscation or forced nationalization of private enterprise, and it is also drawn from the International Country Risk Guide; higher values of this index reflect lower risks. Each specification is an OLS specification that includes parent-year fixed effects. As affiliate controls, the specifications presented in columns (2), (3), (5), and (6) include the log of affiliate sales, the log of affiliate employment, and affiliate net PPE/assets. Affiliate net PPE/assets is the ratio of affiliate net property, plant, and equipment to affiliate assets. Heteroscedasticity-consistent standard errors that correct for clustering at the country-year level appear in parentheses.
indicates that the share of affiliate assets financed by the parent is higher in countries that do not provide creditors with extensive legal protections. This result is consistent with the prediction contained in Proposition 3 and the pattern depicted in Figure II. The specification in column (2) includes the set of other institutional variables also used in Table II to ensure that proxies for financial development are not proxying for some other kind of institutional development. In addition, this specification also controls for affiliate characteristics that the corporate finance literature suggests might influence the availability of external capital. Harris and Raviv (1991) and Rajan and Zingales (1995) find that larger firms and firms with higher levels of tangible assets are more able to obtain external debt. Two proxies for affiliate size (the log of affiliate sales and the log of affiliate employment) and a proxy for the tangibility of affiliate assets (the ratio of affiliate net property, plant, and equipment to affiliate assets) are included.29 In the specification in column (2), the −0.0164 coefficient on creditor rights implies that the share of affiliate assets financed by the affiliate’s parent is 0.0327, or 7.9% of its mean value, higher for affiliates in countries in the 25th percentile of creditor rights relative to the 75th percentile of creditor rights. The negative and significant coefficient on FDI ownership restrictions is consistent with the hypothesis that such restrictions limit parent capital provisions, and the negative and significant coefficient on the log of GDP suggests that affiliates located in smaller markets are more reliant on their parents for capital. When affiliates borrow, they primarily borrow from external sources, and Desai, Foley, and Hines (2004b) show that affiliates borrow more in high-tax jurisdictions. 
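The column (2) magnitude quoted above can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope check, not part of the original analysis: it backs out the interquartile gap in the creditor rights index and the mean parent financing share that the reported figures imply. Neither implied value is stated in the text, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the column (2) magnitude reported in the text.
coef_creditor_rights = -0.0164   # Table III, column (2)
reported_gap = 0.0327            # gap in parent financing share, 25th vs. 75th percentile
reported_share_of_mean = 0.079   # the same gap expressed as a fraction of the mean share

# Implied distance between the 75th and 25th percentiles of the creditor
# rights index (an inference from the reported numbers, not a stated value).
implied_iqr = reported_gap / abs(coef_creditor_rights)

# Implied mean of the dependent variable (share of affiliate assets
# financed by the parent), again backed out rather than reported.
implied_mean_share = reported_gap / reported_share_of_mean

print(f"implied interquartile gap in creditor rights: {implied_iqr:.2f}")
print(f"implied mean parent financing share: {implied_mean_share:.3f}")
```

Under these numbers the interquartile gap comes out at roughly two index points, and the implied mean share falls between the 38% and 45% quintile figures cited for Figure II.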
These facts could explain the negative coefficient on the corporate tax rate in explaining the share of assets financed by the parent.30 Previous theoretical work stressing how concerns over technology expropriation might give rise to multinational activity does not make clear predictions concerning the share of affiliate assets financed by the parent, but it is worth noting that the indices of patent protection and property rights are negative in 29. The affiliate controls included in this specification as well as those in columns (3), (5), and (6) of Table III and columns (2), (3), (5), and (6) of Table IV are potentially endogenous. It is comforting that their inclusion does not typically have a material impact on the estimated effects of capital market conditions. 30. The model’s predictions relate to overall parent capital provision. As such, these specifications differ from the analysis in Desai, Foley, and Hines (2004b), where only borrowing decisions are analyzed.
the specification in column (2). None of the unreported coefficients on affiliate characteristics is significant. If parent financing creates incentives for monitoring and the effects of monitoring are strongest for firms with more technology, then the effects documented in column (2) should be most pronounced for R&D-intensive firms. The specification in column (3) tests for a differential effect of creditor rights on financing by including creditor rights interacted with the log of parent R&D. The negative and significant coefficient on this interaction term indicates that more technologically advanced firms finance a higher share of affiliate assets in countries with weak credit markets. This finding is not implied by many other intuitions for why investor protections might affect parental financing provisions. The specifications presented in columns (4)–(6) of Table III repeat the analysis presented in columns (1)–(3) substituting measures of private credit for creditor rights. In columns (4) and (5), the coefficient on private credit is negative, and it is significant in column (4) but only marginally significant in column (5). In the specification in column (6), the interaction of private credit and the log of parent R&D is significant. The results obtained when using private credit are also consistent with the prediction of Proposition 3 and with Figure II. The model also predicts that multinational parents should hold larger ownership stakes in affiliates located in countries with weak investor protections. Table IV presents results of using the share of affiliate equity owned by the parent as the dependent variable in specifications that are similar to those presented in Table III. 
Although parent equity shares are bounded between 0 and 1, and there is a large grouping of affiliates with equity that is 100% owned by a single parent firm, the specifications presented in Table IV are ordinary least squares models that include parent-year fixed effects and that allow standard errors to be clustered at the country-year level.31 In the specifications presented in columns (1), (2), (4), and (5), the proxy for credit market development is negative and significant. Parent companies own higher shares of affiliate equity when affiliates are located in countries where protections extended to creditors are weaker and private credit is scarcer, as predicted by the model. In the specifications 31. Wholly owned affiliates comprise 77.2% of all observations. These results are robust to using an alternative estimation technique. Conditional logit specifications that use a dependent variable that is equal to 1 for wholly owned affiliates and 0 for partially owned affiliates yield similar results.
TABLE IV
PARENT OWNERSHIP DECISIONS
Dependent variable: Share of affiliate equity owned by parent

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Creditor rights | −0.0091 (0.0028) | −0.0101 (0.0035) | −0.0010 (0.0031) | | | |
| Creditor rights * log of parent R&D | | | −0.0010 (0.0003) | | | |
| Private credit | | | | −0.0506 (0.0135) | −0.0481 (0.0174) | 0.0078 (0.0144) |
| Private credit * log of parent R&D | | | | | | −0.0057 (0.0009) |
| FDI ownership restrictions | −0.0728 (0.0126) | −0.0637 (0.0133) | −0.0611 (0.0134) | −0.0622 (0.0117) | −0.0560 (0.0122) | −0.0529 (0.0122) |
| Workforce schooling | 0.0005 (0.0024) | −0.0049 (0.0024) | −0.0044 (0.0025) | 0.0007 (0.0026) | −0.0030 (0.0026) | −0.0026 (0.0026) |
| Log of GDP | −0.0157 (0.0037) | −0.0116 (0.0046) | −0.0116 (0.0045) | −0.0110 (0.0035) | −0.0079 (0.0046) | −0.0079 (0.0046) |
| Log of GDP per capita | 0.0309 (0.0064) | 0.0358 (0.0132) | 0.0363 (0.0132) | 0.0381 (0.0078) | 0.0402 (0.0143) | 0.0416 (0.0144) |
| Corporate tax rate | −0.2633 (0.0638) | −0.3456 (0.0712) | −0.3391 (0.0700) | −0.2778 (0.0584) | −0.3249 (0.0582) | −0.3179 (0.0564) |
| Patent protections | | −0.0142 (0.0073) | −0.0137 (0.0072) | | −0.0127 (0.0078) | −0.0122 (0.0077) |
| Property rights | | 0.0055 (0.0072) | 0.0044 (0.0075) | | 0.0000 (0.0071) | −0.0014 (0.0072) |
| Rule of law | | 0.0005 (0.0061) | 0.0012 (0.0061) | | 0.0009 (0.0061) | 0.0017 (0.0060) |
| Risk of expropriation | | 0.0054 (0.0068) | 0.0050 (0.0068) | | 0.0069 (0.0066) | 0.0059 (0.0067) |
| Constant | 1.1593 (0.1006) | 1.0774 (0.1147) | 1.0675 (0.1121) | 0.9833 (0.0947) | 0.9356 (0.1055) | 0.9159 (0.1018) |
| Parent-year fixed effects? | Y | Y | Y | Y | Y | Y |
| Affiliate controls? | N | Y | Y | N | Y | Y |
| No. of obs. | 51,320 | 41,436 | 40,498 | 48,422 | 39,096 | 38,198 |
| R2 | .3974 | .4250 | .4184 | .3998 | .4275 | .4217 |
Notes. The dependent variable is the share of affiliate equity owned by the affiliate’s parent. Creditor rights is an index of the strength of creditor rights developed in Djankov, McLiesh, and Shleifer (2007); higher levels of the measure indicate stronger legal protections. Private credit is the ratio of private credit lent by deposit money banks to GDP, as provided in Beck, Demirgüç-Kunt, and Levine (1999). FDI ownership restrictions is a dummy equal to 0 if two measures of restrictions on foreign ownership as measured by Shatz (2000) are above 3 on a scale of 1 to 5 and 1 otherwise. Workforce schooling is the average schooling years in the population over age 25 years provided in Barro and Lee (2000). Corporate tax rate is the median effective tax rate paid by affiliates in a particular country and year. Patent protections is an index of the strength of patent rights provided in Ginarte and Park (1997). Property rights is an index of the strength of property rights drawn from the 1996 Index of Economic Freedom. Rule of law is an assessment of the strength and impartiality of a country’s overall legal system drawn from the International Country Risk Guide. Risk of expropriation is an index of the risk of outright confiscation or forced nationalization of private enterprise, and it is also drawn from the International Country Risk Guide; higher values of this index reflect lower risks. Each specification is an OLS specification that includes parent-year fixed effects. As affiliate controls, the specifications presented in columns (2), (3), (5), and (6) include the log of affiliate sales, the log of affiliate employment, and affiliate net PPE/assets. Affiliate net PPE/assets is the ratio of affiliate net property, plant, and equipment to affiliate assets. Heteroscedasticity-consistent standard errors that correct for clustering at the country-year level appear in parentheses.
presented in columns (3) and (6), the negative and significant coefficients on the interaction terms indicate that these results are also more pronounced for technologically advanced firms. The results in Table IV also indicate that equity ownership shares are lower in countries with ownership restrictions and countries with higher corporate tax rates. If equity ownership decisions placed strong emphasis on the protection of technology and ownership substituted for weak patent protections, the coefficient on the patent protections variable should be negative and significant. Although the estimated coefficient is negative, it is only marginally significant in some specifications. The results presented in Tables II, III, and IV are robust to a number of concerns. First, it is possible that the estimates of coefficients on capital market conditions interacted with the log of parent R&D may reflect the effect of similar interactions with alternative institutional variables. To consider this possibility, it is useful to consider the inclusion of other interaction terms. For example, when the log of parent R&D interacted with the rule of law index is included in the specifications presented in columns (3) and (6) of the three tables, the interactions that include proxies for capital market development remain significant in all of the tests. When the log of parent R&D interacted with the patent protection index is included in these specifications, the interactions featuring proxies for credit market development remain significant in all of the tests except for the one in column (3) of Table II. As such, it appears that the role of R&D intensity is most pronounced through the channel emphasized in the model, through interactions with capital market conditions. 
It may also be the case that the share of affiliate assets financed by the parent and parent ownership levels are lower for older affiliates and these affiliates may be more prevalent in countries with better investor protections. Including proxies for affiliate age in the specifications presented in Tables III and IV does not affect the results of interest.32 Similarly, the model does not explicitly consider the possibility that a firm exploits its technology through trade as opposed to through FDI or arm’s length technology transfers. To consider if trade channels could affect the main findings, the log of parent exports to unaffiliated foreign persons 32. The proxies for age are the number of years since an affiliate first reported data to BEA and a dummy equal to 1 if the affiliate first reported in 1982 and 0 otherwise.
in each country and year is included as a control in each of the specifications. The magnitude and significance of the coefficients on the proxies for capital market conditions and the interactions of these proxies and the log of parent R&D are not materially changed. Finally, contractual forms that are specific to the natural resources sector could affect some of the results. Removing observations of firms in this sector reduces the significance of the results on the effects of private credit in specifications presented in column (5) of Tables II and III, but does not materially change any of the other results on the effects of capital market conditions in Tables II, III, and IV.33

IV.C. Scale of Multinational Activity
The model predicts that multinational activity will be of a larger scale in countries with stronger investor protections. Because there are many theories for the determinants of FDI activity, using specifications similar to those presented in Tables II, III, and IV to explore scale is problematic because it is difficult to include a set of controls sufficiently extensive to distinguish between alternative theories.34 Fortunately, a subtler prediction of the model allows for tests of scale effects. Specifically, the model suggests that the response to the liberalizations of ownership restrictions should be larger in host countries with weak investor protections. The intuition for this prediction is that in countries with weak investor protections, ownership restrictions are more likely to bind because ownership is most critical for maximizing the value of the enterprise in these settings. As such, the relaxation of an ownership constraint should have muted effects for affiliates in countries with deep capital markets and more pronounced effects for affiliates in countries with weaker investor protections.35
33. Firms in BEA industries 101–148 and 291–299 are dropped from the sample.
The coefficient on private credit in column (5) of Table II, when estimated using the reduced sample, is 0.0292, and it has a t-statistic of 1.92. The coefficient on private credit in column (5) of Table III, when estimated using the reduced sample, is −0.0351, and it has a t-statistic of 1.62.
34. Appendix Table I in Antràs, Desai, and Foley (2007) presents the results of such an exercise. Although the coefficients on both the creditor rights variables and private credit variables are usually positive in explaining the log of affiliate sales, as Proposition 2 predicts, none of the coefficients on these variables is significant.
35. This prediction can in fact be explicitly derived from the model. In particular, one can envision an ownership restriction as an additional constraint to program (P1), requiring that $\phi_I \le \bar{\phi}_I$ for some foreign ownership cap $\bar{\phi}_I \in \mathbb{R}$. One can then show (details available upon request) that (i) for large enough $\gamma$, this constraint will not bind, and thus a removal of the constraint will have no effects
FIGURE III
Liberalizations and Multinational Firm Growth
The two lines correspond to averages of an index computed at the country level as the ratio of aggregate affiliate sales in a given year to the level of sales in the year of the liberalization. Countries are split into two samples at the median level of private credit. Private credit is the ratio of private credit lent by deposit money banks to GDP, as provided in Beck, Demirgüç-Kunt, and Levine (1999).
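The index behind Figure III can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the country labels, sales figures, and private credit values are made up, and aligning countries on years relative to liberalization is an assumption about how the averages are taken.

```python
from statistics import mean, median

# Hypothetical inputs: aggregate affiliate sales by country and year, the
# liberalization year, and private credit in the year before liberalization.
sales = {
    "A": {1989: 80.0, 1990: 100.0, 1991: 130.0},
    "B": {1989: 95.0, 1990: 100.0, 1991: 104.0},
    "C": {1989: 50.0, 1990: 60.0, 1991: 90.0},
    "D": {1989: 190.0, 1990: 200.0, 1991: 210.0},
}
liberalization_year = {"A": 1990, "B": 1990, "C": 1990, "D": 1990}
private_credit = {"A": 0.20, "B": 0.80, "C": 0.15, "D": 0.60}


def sales_index(country):
    """Sales in each year relative to the liberalization year, keyed by event time."""
    base_year = liberalization_year[country]
    base = sales[country][base_year]
    return {year - base_year: value / base for year, value in sales[country].items()}


def average_index(countries):
    """Average the index across countries at each event time they share."""
    indices = [sales_index(c) for c in countries]
    event_times = sorted(set.intersection(*(set(i) for i in indices)))
    return {t: mean(i[t] for i in indices) for t in event_times}


# Split at the median: at or below the median is "low", above is "high".
cutoff = median(private_credit.values())
low_credit = [c for c, v in private_credit.items() if v <= cutoff]
high_credit = [c for c, v in private_credit.items() if v > cutoff]

low_line = average_index(low_credit)
high_line = average_index(high_credit)
```

By construction each line equals 1 in the liberalization year, so the two groups are compared only on growth relative to that year rather than on the level of sales.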
Figure III illustrates how the scale of multinational activity changes around the time of ownership liberalizations in countries with different levels of capital market development. Liberalizations are defined as the first year in which the FDI ownership restriction dummy described above changes from 1 to 0.36 The
on affiliate activity; (ii) when the constraint binds, the level of affiliate activity $x$ is lower than in the absence of the constraint; and (iii) a marginal increase in $\bar{\phi}_I$ (i.e., a relaxation of the restriction) increases $x$ by more, the lower is $\gamma$. Hence, the response of affiliate activity to a removal of ownership restrictions will be relatively larger in countries with relatively weaker investor protections.
36. The countries experiencing a liberalization are Argentina (1990), Australia (1987), Colombia (1992), Ecuador (1991), Finland (1990), Honduras (1993), Japan (1993), Malaysia (1987), Mexico (1990), Norway (1995), Peru (1992), Philippines (1992), Portugal (1987), Sweden (1992), Trinidad and Tobago (1994), and Venezuela (1990). Because control variables measuring the development of institutions other than credit markets do not vary much (if at all) through time and are unavailable for six of the sixteen reforming countries, these controls are not included in the analysis of liberalizations. The affiliate fixed effects implicitly control for time-invariant country characteristics, and so this is unlikely to pose a significant problem.
TABLE V
LIBERALIZATIONS AND MULTINATIONAL FIRM SCALE

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Dependent variable | Log of affiliate sales | Log of affiliate sales | Log of aggregate affiliate sales | Log of aggregate affiliate sales |
| Post-liberalization dummy | 0.0016 (0.0684) | −0.0073 (0.0712) | −0.0633 (0.1230) | −0.1049 (0.1262) |
| Post-liberalization dummy * low creditor rights dummy | 0.3011 (0.0827) | | 0.3682 (0.1552) | |
| Post-liberalization dummy * low private credit dummy | | 0.2947 (0.0899) | | 0.3812 (0.1769) |
| Log of GDP | 0.3886 (0.3888) | 0.3409 (0.3960) | −0.0786 (0.7833) | −0.1351 (0.7040) |
| Log of GDP per capita | 1.3675 (0.3720) | 1.4488 (0.3867) | 2.6620 (0.5425) | 2.8376 (0.6192) |
| Constant | −13.5818 (9.2414) | −13.0613 (9.2484) | −4.7847 (22.1876) | −4.9033 (20.0397) |
| Affiliate and year fixed effects? | Y | Y | N | N |
| Country and year fixed effects? | N | N | Y | Y |
| No. of obs. | 180,796 | 181,103 | 827 | 845 |
| R2 | .8035 | .8040 | .9243 | .9251 |
Notes. The dependent variable in the first two columns is the log of affiliate sales, and the dependent variable in the last two columns is the log of affiliate sales aggregated across affiliates in a particular country. The data are annual data covering the 1982–1999 period. The post-liberalization dummy is equal to 1 for the sixteen countries that liberalize their ownership restrictions in the year of and years following liberalization of foreign ownership restrictions. The low creditor rights dummy is equal to 1 for observations related to countries with below median levels of creditor rights among liberalizing countries measured in the year prior to liberalization and 0 otherwise. The low private credit dummy is equal to 1 for observations related to countries with below median levels of private credit among liberalizing countries measured in the year prior to liberalization and 0 otherwise. Creditor rights is an index of the strength of creditor rights developed in Djankov, McLiesh, and Shleifer (2007). Private credit is the ratio of private credit lent by deposit money banks to GDP, as provided in Beck, Demirgüç-Kunt, and Levine (1999). The first two specifications are OLS specifications that include affiliate and year fixed effects, and the last two are OLS specifications that include country and year fixed effects. Heteroscedasticity-consistent standard errors that correct for clustering at the country level appear in parentheses.
lines trace out an index that is computed by calculating the ratio of aggregate affiliate sales in a particular country and year to the value of aggregate affiliate sales in that country in the year of liberalization. The line demarcated by squares (triangles) plots the average of this index across countries that have a measure of private credit in the year prior to the liberalization that is equal to or less than (above) the median private credit of liberalizing countries. The lines indicate that affiliate activity increases by a larger margin in countries with low levels of private credit following liberalizations. The specifications presented in Table V investigate whether these differential effects are robust. The dependent variable in
columns (1) and (2) is the log value of affiliate sales, and the sample consists of the full panel from 1982 to 1999. Given the limited data requirements of these specifications (relative to the variables investigated in Tables II, III, and IV) and the desire to investigate changes within affiliates, the full panel provides a more appropriate setting for these tests. These specifications include affiliate and year fixed effects, and the standard errors are clustered at the country level. The sample includes all countries so that affiliate activity in countries that do not liberalize helps identify the year effects and the coefficients on the income variables. The results are robust to using a sample drawn only from reforming countries. The specifications in columns (1) and (2) include controls for log GDP, log GDP per capita, and the post-liberalization dummy. The coefficient on log GDP per capita is positive and significant, indicating that rising incomes are associated with larger levels of affiliate activity. The coefficient of interest in column (1) is the coefficient on the interaction of the post-liberalization dummy and a dummy that is equal to 1 if the country has a value of the creditor rights index in the year before liberalization that is equal to or less than the median value for liberalizing countries. The positive and significant coefficient indicates that affiliates in weak-creditor-rights countries grow quickly after liberalizations. The coefficient on the post-liberalization dummy on its own indicates that the effect of liberalizations is negligible and statistically insignificant for affiliates in high-creditor-rights countries. In column (2), these same results are obtained when the measure of private credit is used as the proxy for financial development. 
At the affiliate level, the model’s predictions regarding how the scale of activity relates to capital market depth are validated using tests that, through the use of affiliate fixed effects and the emphasis on the interaction term, are difficult to reconcile with alternative theories. It is possible that the results presented in columns (1) and (2) inaccurately capture the effects of the liberalizations because they only measure activity on the intensive margin and fail to capture responses on the extensive margin. Entry or exit might accompany liberalizations and might amplify or dampen these results. Figure III suggests this is not the case because it is constructed using data aggregated to the country level. The specifications provided in columns (3) and (4) employ a dependent variable that is the log value of the aggregate value of all sales of U.S. multinational affiliates within a country-year cell. These specifications substitute country fixed effects for affiliate fixed effects but are otherwise similar to the regressions provided in columns (1) and (2).
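To give a sense of magnitudes, the Table V point estimates for the creditor rights specifications in columns (1) and (3) can be converted from log points into approximate percentage changes in sales. The sketch below is illustrative only: it uses point estimates, ignores the standard errors, and the dictionary keys and helper name are hypothetical.

```python
import math

# Table V point estimates (log points). The interaction term is added to the
# base post-liberalization effect for countries with low creditor rights.
post_lib = {"col1": 0.0016, "col3": -0.0633}
post_lib_x_low_creditor_rights = {"col1": 0.3011, "col3": 0.3682}


def implied_pct_change(log_points):
    """Convert a change in log sales into a percentage change in sales."""
    return (math.exp(log_points) - 1.0) * 100.0


for col in ("col1", "col3"):
    high = implied_pct_change(post_lib[col])
    low = implied_pct_change(post_lib[col] + post_lib_x_low_creditor_rights[col])
    print(f"{col}: high creditor rights {high:.1f}%, low creditor rights {low:.1f}%")
```

The exponential transformation matters here because a coefficient near 0.3 is too large for the usual approximation that log points equal percentage points.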
In column (3), the coefficient on the interaction term including the creditor rights variable is again positive and significant, indicating that incorporating activity on the extensive margin does not appear to contradict the earlier result. In column (4), the coefficient on the interaction term is again positive and significant. Taken together, the results suggest that the scale of activity is positively related to the quality of investor protections and capital market development, and these results persist when incorporating the effects of entry and exit.

V. CONCLUSION
Efforts to understand patterns of MNC activity have typically emphasized aspects of technology expropriation rather than the constraints imposed by weak investor protections and shallow capital markets. In the prior literature, MNCs arise because of the risk of a partner expropriating a proprietary technology. In the model presented in this paper, the exploitation of technology is central to understanding MNC activity, but the critical constraint is the nature of capital market development and investor protections in host countries. Entrepreneurs must raise capital to fund projects, and external investors are aware of the possibility that these entrepreneurs might behave opportunistically. Inventors can alleviate financial frictions because they have privileged knowledge of their technology and can thus play a role in monitoring entrepreneurs. As a result, MNC activity and capital flows arise endogenously to ensure that monitoring occurs. External investors demand higher levels of multinational parent firm financial participation in countries with weak investor protections. By placing financial frictions at the center of understanding patterns of activity and flows, the model delivers novel predictions about the use of arm’s length technology transfers and about the financial and investment decisions of MNCs that are validated in firm-level analysis. 
The use of arm’s length technology transfers is more common in countries with strong investor protections and deep capital markets. Previous findings that FDI flows to developing countries are limited reflect two opposing forces. Weak investor protections and shallow capital markets limit the efficient scale of enterprise but also result in greater parent provision of capital and more parent ownership of affiliate equity. The effects of the institutional setting are more pronounced for R&D-intensive firms because parental monitoring is particularly valuable for the
investments of these firms. By jointly considering operational and financial decisions, the theory and empirics provide an integrated explanation for patterns of MNC activity and FDI flows that have typically been considered separately. Further consideration of the role of financial frictions on MNC activity along several dimensions may prove fruitful. First, the model presented effectively rules out exports to unrelated parties as a means of serving foreign markets. Incorporating the trade-off between exports and production abroad in a world with financial frictions may yield additional predictions that would help explain the choice between exporting and FDI. Second, exploring the implications of financial frictions for intrafirm trade may help explain how the demands of external funders in weak institutional environments affect the fragmentation of production processes across borders. Finally, the central role of foreign ownership in reducing diversion may lead to significant variation in the relative competitiveness of local and foreign firms that reflects the institutional environment emphasized in this paper.

APPENDIX I: THE SHADOW COST OF CASH
In the main text, we have treated the shadow value of cash $\beta$ as exogenous. In this Appendix we briefly illustrate how to endogenize it and show how it relates to characteristics of the Home country and in particular to its level of investor protection. For this purpose, we generalize the setup described in Section II.A and consider the situation in which there are $J - 1$ Foreign countries, each associated with a level of financial development $\gamma^j$ and a cash flow function $R^j(x^j)$.37 The inventor contracts with each of $J - 1$ foreign entrepreneurs and, as a result of the optimal contracting described above, has an amount of cash equal to $W - \sum_{j \neq H} \tilde{F}^j$ to invest in the Home country. Preferences and technology at Home are such that the cash flows obtained from the sale of the differentiated good at Home can be expressed as a strictly increasing and concave function of the quantity produced, $R^H(q^H)$, satisfying the same properties as the cash flow function in other countries. Home production is managed by the inventor, who can also privately choose to behave or misbehave, with consequences identical to those discussed
37. With some abuse of notation, we use $J$ to denote both the number of countries as well as the set of these countries.
above: if the inventor behaves, the project performs with probability $p_H$, but if he misbehaves, the project performs with a lower probability $p_L$. In the latter case, however, the inventor obtains a private benefit equal to a fraction $1 - \gamma^H$ of cash flows, where $\gamma^H$ is an index of investor protection at Home. The inventor sells domestic cash flow rights to a continuum of external investors at Home, who can obtain a rate of return equal to 1 in an alternative investment opportunity.38 We consider an optimal financial contract between the inventor and external investors in which the inventor is granted the ability to make take-it-or-leave-it offers, just as in the main text. The optimal contract specifies the scale of operation $x^H$, the amount of cash $W_x$ that the inventor invests in the project, the share of equity $\phi_E^H$ sold to external investors, and the amount of cash $E^H$ provided by external investors. Taking the contracts signed with foreign individuals as given, an optimal financial contract with external investors at Home that induces the inventor to behave is given by the tuple $\{\tilde{x}^H, \tilde{W}_x, \tilde{\phi}_E^H, \tilde{E}^H\}$ that solves the following program:

$$
\begin{aligned}
\max_{x^H,\, W_x,\, \phi_E^H,\, E^H} \quad & \Pi_I = \sum_{j \neq H} \left( \phi_I^j p_H - C^j \right) R^j(x^j) + p_H \left( 1 - \phi_E^H \right) R^H(x^H) + W - \sum_{j \neq H} \tilde{F}^j - W_x \\
\text{s.t.} \quad & x^H \le E^H + W_x, \\
& W_x \le W - \sum_{j \neq H} \tilde{F}^j, \\
& p_H \phi_E^H R^H(x^H) \ge E^H, \\
& (p_H - p_L) \left( 1 - \phi_E^H \right) R^H(x^H) \ge (1 - \gamma^H) R^H(x^H).
\end{aligned}
\tag{P2}
$$

It is straightforward to show that provided that $\gamma^H$ is low enough (i.e., provided that financial frictions at Home are large enough), all constraints in program (P2) will bind in equilibrium, and the profits of the entrepreneur can be expressed as

$$
\Pi_I = \sum_{j \neq H} \left( \phi_I^j p_H - C^j \right) R^j(x^j) + \hat{\beta} \left( W - \sum_{j \neq H} \tilde{F}^j \right), \tag{6}
$$
38. For simplicity, we assume that the inventor cannot pledge foreign cash flow rights to its external investors at Home.
where

\[
\hat{\beta} = \frac{\dfrac{1-\gamma^H}{p_H - p_L}}{\dfrac{\tilde{x}^H}{p_H R^H(\tilde{x}^H)} - \left(1 - \dfrac{1-\gamma^H}{p_H - p_L}\right)} > 1.
\]

Notice that the resulting profit function (6) is closely related to that considered in program (P1) in Section II.B, where $\hat{\beta}$ now replaces $\beta$. There are, however, two important differences between the two profit functions.

First, the formulation in (6) considers the case in which the inventor obtains cash flow from the exploitation of the technology in multiple countries. Nevertheless, notice that for a given $\hat{\beta}$, the profit function features separability between these different sources of dividends. As a result, for a given $\hat{\beta}$, the optimal contract with the entrepreneur and external investors in each country $j$ is as described in Section II.B.39 Hence, Propositions 1, 2, and 3 continue to apply and their statements not only apply to changes in the parameter $\gamma$ but also to cross-sectional (cross-country) variation in investor protection. In this sense, the tests performed in Section IV are well defined.

The second important difference between the profit function in (6) and in program (P1) is that the shadow value of cash $\hat{\beta}$ is in fact endogenous, in the sense that it is a function of the scale of operation at Home $x^H$, which in turn will depend on the optimal contracts in the other J countries through the date-0 transfers $\tilde{F}^j$ for $j \neq H$ (as is clear from program (P2)). Hence, $\hat{\beta}$ will in general be a function of the vector of country investor protections $\gamma \equiv (\gamma^1, \ldots, \gamma^{J-1}, \gamma^H)$. Notice, however, that for large enough $J$, the effect of a particular investor protection level $\gamma^j$ ($j \neq H$) on the overall shadow value of cash $\hat{\beta}$ will tend to be negligible, and thus the comparative static results in Section II.B will continue to apply.

39. Notice also that when $\hat{\beta} > 1$, the inventor is financially constrained at Home, in the sense that external investors at Home are only willing to lend to him a multiple of his pledgeable income (wealth plus date-0 payments). If external investors were to lend a larger amount, the inventor’s incentive-compatibility constraint would be violated. The same would of course apply to external investors in foreign countries. This helps rationalize our assumption in Section II.A that the inventor does not sign bilateral financial contracts with external investors in host countries.

To sum up, this Appendix has illustrated that a higher-than-1 shadow value of cash can easily be rationalized in a simple
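extension of our initial partial equilibrium model, in which not only foreign entrepreneurs, but also the inventor, face financial constraints. We have seen that endogenizing the shadow value of cash may affect the solution of the optimal contract in subtle ways, but that if the number of host countries in which the inventor exploits his technology is large enough, the comparative static results in Section II.B remain qualitatively valid.

APPENDIX II: CHARACTERIZATION OF THE OPTIMAL CONTRACT

Let us start by writing the Lagrangian corresponding to program (P1). Letting $\lambda_k$ denote the multiplier corresponding to constraint $k = 1, 2, 4, 5$ (remember constraint (iii) cannot bind), we have

\[
\begin{aligned}
\mathcal{L} = {} & \phi_I p_H R(x) + (W - F)\beta - C R(x) + \lambda_1 (E + F - x) + \lambda_2 \left(p_H \phi_E R(x) - E\right) \\
& + \lambda_4 \left((p_H - p_L)(1 - \phi_E - \phi_I) - (1-\gamma)\delta(C)\right) + \lambda_5 \left(\phi_I - \frac{C}{p_H - p_L}\right).
\end{aligned}
\]

The first-order conditions of this program (apart from the standard complementarity slackness conditions) are

\begin{align*}
\frac{\partial \mathcal{L}}{\partial F} &= -\beta + \lambda_1 = 0, \\
\frac{\partial \mathcal{L}}{\partial \phi_I} &= p_H R(\tilde{x}) - \lambda_4 (p_H - p_L) + \lambda_5 = 0, \\
\frac{\partial \mathcal{L}}{\partial x} &= \tilde{\phi}_I p_H R'(\tilde{x}) - \tilde{C} R'(\tilde{x}) - \lambda_1 + \lambda_2 p_H \tilde{\phi}_E R'(\tilde{x}) = 0, \tag{7} \\
\frac{\partial \mathcal{L}}{\partial \phi_E} &= \lambda_2 p_H R(\tilde{x}) - \lambda_4 (p_H - p_L) = 0, \\
\frac{\partial \mathcal{L}}{\partial E} &= \lambda_1 - \lambda_2 = 0, \\
\frac{\partial \mathcal{L}}{\partial C} &= -R(\tilde{x}) - \lambda_4 (1-\gamma)\delta'(\tilde{C}) - \frac{\lambda_5}{p_H - p_L} = 0. \tag{8}
\end{align*}

Straightforward manipulation of these conditions delivers

\[
\begin{aligned}
\lambda_1 &= \lambda_2 = \beta > 0, \\
\lambda_4 &= \frac{p_H}{p_H - p_L}\,\lambda_2 R(\tilde{x}) = \beta\,\frac{p_H}{p_H - p_L}\,R(\tilde{x}) > 0, \\
\lambda_5 &= (\beta - 1)\,p_H R(\tilde{x}) > 0,
\end{aligned}
\]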
from which we conclude that all constraints bind, as claimed in the main text. The fact that $\lambda_5 > 0$ immediately implies that constraint (v) binds and we have $\tilde{\phi}_I = \tilde{C}/(p_H - p_L)$, as indicated in equation (2). Next, plugging the values of the multipliers in (8) yields

\[
-\delta'(\tilde{C}) = \frac{\beta p_H - p_L}{(1-\gamma)\,\beta p_H},
\]

as claimed in equation (3) in the main text. Next, plugging the multipliers and $\tilde{\phi}_I$ into (7) yields

\[
R'(\tilde{x}) = \frac{1}{p_H \left(1 - \dfrac{(1-\gamma)\,\delta(\tilde{C})}{p_H - p_L} - \dfrac{\tilde{C}}{p_H - p_L}\,\dfrac{\beta p_H - p_L}{\beta p_H}\right)},
\]

which corresponds to equation (4) in the main text. Setting the constraints to equality, we can also compute the total payoff obtained by the inventor:

\[
\tilde{\Pi}_I = W\beta + \beta\left(\frac{R(\tilde{x})}{R'(\tilde{x})} - \tilde{x}\right).
\tag{9}
\]

This expression can be used to analyze when it is optimal for the inventor to implement good behavior. To do so, consider the optimal contract that implements bad behavior. It is clear that in this case the inventor has no incentive to exert monitoring effort. It is also immediate that even when the entrepreneur does not obtain any share of the cash flows, her participation constraint will be satisfied, and thus we have that $\tilde{\phi}_I^L + \tilde{\phi}_E^L = 1$. The program defining the optimal contract can then be written as

\[
\begin{aligned}
\max_{F,\, \phi_I,\, x,\, E} \quad & \Pi_I = \phi_I p_L R(x) + (W - F)\beta \\
\text{s.t.} \quad & x \le E + F && \text{(i)} \\
& p_L (1 - \phi_I) R(x) \ge E && \text{(ii)} \\
& \phi_I \ge 0. && \text{(iii)}
\end{aligned}
\tag{P1$^L$}
\]

Following the same steps as before, we find that all three constraints will bind, and hence $\tilde{C}^L = \tilde{\phi}_I^L = 0$. Furthermore, the optimal level of investment is given by

\[
p_L R'(\tilde{x}^L) = 1,
\]
while the overall payoff obtained by the inventor equals

\[
\tilde{\Pi}_I^L = \beta W + \beta\left(\frac{R(\tilde{x}^L)}{R'(\tilde{x}^L)} - \tilde{x}^L\right).
\tag{10}
\]

Comparing equations (9) and (10) we see that $\tilde{\Pi}_I > \tilde{\Pi}_I^L$ if and only if

\[
\frac{R(\tilde{x})}{R'(\tilde{x})} - \tilde{x} > \frac{R(\tilde{x}^L)}{R'(\tilde{x}^L)} - \tilde{x}^L.
\]

However, because $R(x)/R'(x) - x$ is strictly increasing in $x$ whenever $R''(x) < 0$, we can conclude that good behavior will be implemented whenever $\tilde{x} > \tilde{x}^L$. Note also that $\tilde{x}$ is increasing in $\gamma$ (this is formally proved in Appendix III), while $\tilde{x}^L$ is independent of $\gamma$. Furthermore, when $\gamma \to 1$, it is necessarily the case that $\tilde{x} > \tilde{x}^L$. Hence, there exists a threshold $\gamma$ over which it is optimal to implement good behavior.

APPENDIX III: PROOFS OF COMPARATIVE STATIC RESULTS

The comparative statics in Lemma 1 simply follow from the fact that the right-hand side of equation (3) is increasing in $\gamma$ and $\beta$, while the left-hand side is decreasing in $\tilde{C}$ (given the convexity of $\delta(\cdot)$). The statements of Proposition 1 directly follow from Lemma 1 because $\tilde{\phi}_I$ is proportional to $\tilde{C}$.

Consider next the comparative statics in Proposition 2. For that purpose, let

\[
F(\gamma, \beta, \tilde{C}(\gamma, \beta)) = \frac{(1-\gamma)\,\delta(\tilde{C})}{p_H - p_L} + \frac{\tilde{C}}{p_H - p_L}\,\frac{\beta p_H - p_L}{\beta p_H},
\]

so that

\[
p_H R'(\tilde{x})\left[1 - F(\gamma, \beta, \tilde{C}(\gamma, \beta))\right] = 1.
\]

Using equation (3), we can establish that

\[
\frac{dF(\cdot)}{d\gamma} = -\frac{\delta(\tilde{C})}{p_H - p_L} + \frac{(1-\gamma)\,\delta'(\tilde{C})}{p_H - p_L}\,\frac{d\tilde{C}}{d\gamma} + \frac{1}{p_H - p_L}\,\frac{\beta p_H - p_L}{\beta p_H}\,\frac{d\tilde{C}}{d\gamma} = -\frac{\delta(\tilde{C})}{p_H - p_L} < 0;
\]
\[
\frac{dF(\cdot)}{d\beta} = \frac{(1-\gamma)\,\delta'(\tilde{C})}{p_H - p_L}\,\frac{d\tilde{C}}{d\beta} + \frac{1}{p_H - p_L}\,\frac{\beta p_H - p_L}{\beta p_H}\,\frac{d\tilde{C}}{d\beta} + \frac{p_L \tilde{C}}{(p_H - p_L)\,\beta^2 p_H} = \frac{p_L \tilde{C}}{(p_H - p_L)\,\beta^2 p_H} > 0.
\]

From inspection of (4) and the concavity of $R(\cdot)$, it is then clear that $\tilde{x}$ is increasing in $\gamma$ and decreasing in $\beta$.

Finally, the statements in Proposition 3 follow from the discussion in the main text and the fact that $R(\tilde{x})/\tilde{x}$ is decreasing in $\tilde{x}$, and thus decreasing in $\gamma$ and increasing in $\beta$.

APPENDIX IV: GENERALIZATIONS

In this Appendix, we provide more details on the generalizations outlined in Section II.C. Consider first the case in which the entrepreneur’s private benefit and the inventor’s private cost of monitoring are proportional to $x$ rather than to $R(x)$. Following the same steps as in the formulation in the main text, we find that the optimal $\hat{C}$ and $\hat{x}$ are now given by

\[
-\delta'(\hat{C}) = \frac{\beta p_H - p_L}{\beta p_H (1-\gamma)}
\]

and

\[
p_H R'(\hat{x}) = 1 + \frac{p_H (1-\gamma)\,\delta(\hat{C})}{p_H - p_L} + \frac{(\beta p_H - p_L)\,\hat{C}}{\beta (p_H - p_L)}.
\tag{11}
\]

Straightforward differentiation indicates that $\hat{C}$ continues to be decreasing in $\gamma$ and $\hat{x}$ continues to be increasing in $\gamma$, as in our paper. Next note that we can use equation (11) to write

\[
\hat{\phi}_I = \frac{\hat{C}\hat{x}}{(p_H - p_L) R(\hat{x})} = \frac{\hat{x} R'(\hat{x})}{(p_H - p_L) R(\hat{x})}\,\frac{p_H \hat{C}}{p_H R'(\hat{x})} = \frac{p_H\, \hat{x} R'(\hat{x})}{(p_H - p_L) R(\hat{x})}\left[\frac{1}{\hat{C}} + \frac{p_H (1-\gamma)\,\delta(\hat{C})}{(p_H - p_L)\,\hat{C}} + \frac{\beta p_H - p_L}{\beta (p_H - p_L)}\right]^{-1}.
\]

It is straightforward to see that the last term continues to be an increasing function of $\hat{C}$ and is thus decreasing in $\gamma$. This implies that the only way that $\hat{\phi}_I$ could be increasing in $\gamma$ would be if the elasticity of revenue to output, that is, $\alpha(\hat{x}) \equiv \hat{x} R'(\hat{x})/R(\hat{x})$, was sufficiently increasing in $\hat{x}$. For a constant-elasticity function, $R(x) = A x^{\alpha}$, we have $\alpha(\hat{x}) = \alpha$ for all $\hat{x}$ and thus $\hat{\phi}_I$ continues to be
decreasing in $\gamma$ for any level of concavity of the $R(x)$ function. Remember that the revenue function will be isoelastic whenever the firm faces a demand with constant price elasticity. If the firm were to face a linear demand function, then $\alpha(\hat{x})$ would be decreasing in $\hat{x}$, hence reinforcing the result that $\hat{\phi}_I$ is decreasing in $\gamma$.

We next consider the case in which (local) external investors can also serve a monitoring role. In particular, if external investors exert an unverifiable effort cost $MR(x)$, the private benefit is now

\[
B(C, M; \gamma) = (1-\gamma)\left(\delta(C) + \mu(M)\right),
\]

with $\mu(\cdot)$ satisfying the same properties as $\delta(\cdot)$ above, namely, $\mu'(M) < 0$, $\mu''(M) > 0$, $\mu(0) = \bar{\mu}$, $\lim_{M \to \infty} \mu(M) = 0$, $\lim_{M \to 0} \mu'(M) = -\infty$, and $\lim_{M \to \infty} \mu'(M) = 0$. Because local monitoring is not verifiable, the program that determines the optimal contract will need to include a new incentive compatibility constraint for external investors. In particular, an optimal contract that induces the entrepreneur to behave is now given by the tuple $\{\hat{F}, \hat{\phi}_I, \hat{x}, \hat{\phi}_E, \hat{E}, \hat{C}, \hat{M}\}$ that solves a program analogous to (P1) but with constraints (ii) and (iv) now given by

\[
p_H \phi_E R(x) - M R(x) \ge E
\tag{ii}
\]

\[
(p_H - p_L)(1 - \phi_E - \phi_I) \ge (1-\gamma)\left(\delta(C) + \mu(M)\right)
\tag{iv}
\]

and with the additional constraint

\[
\phi_E \ge M/(p_H - p_L).
\tag{vi}
\]

Manipulating the first-order conditions of this new program, we obtain

\[
\lambda_5 = (\beta - 1)\,p_H R(\hat{x}) + \lambda_6,
\]

which immediately implies that constraint (v) continues to bind even in the case with local monitoring. Consequently, inventor (or parent firm) equity shares continue to move proportionately with the amount of monitoring undertaken by the inventor.

Furthermore, provided that the level of investor protection is sufficiently high, the analysis in the main text goes through essentially unaltered. The reason for this is that in such a case, constraint (vi) is not binding ($\lambda_6 = 0$) and we obtain $\hat{C}$ and $\hat{M}$ being determined by

\[
-\delta'(\hat{C}) = \frac{\beta p_H - p_L}{(1-\gamma)\,\beta p_H},
\tag{12}
\]

which is identical to (3), and

\[
-\mu'(\hat{M}) = \frac{p_H - p_L}{(1-\gamma)\,p_H}.
\tag{13}
\]

From the convexity of the monitoring functions, we thus obtain that both $\hat{C}$ and $\hat{M}$ are decreasing functions of $\gamma$. Furthermore, manipulating the first-order conditions we can also easily show that (i) the investment levels (and thus sales revenue) continue to be increasing in $\gamma$, and (ii) the ratio $\hat{F}/\hat{x}$ is still increasing in $\gamma$, provided that $\alpha(\hat{x})$ does not increase in $\hat{x}$ too quickly, just as in the main text (details available upon request).

Note that equations (12) and (13) also imply that $-\delta'(\hat{C}) > -\mu'(\hat{M})$, and if the functions $\delta(\cdot)$ and $\mu(\cdot)$ are sufficiently similar we will have $\hat{M} > \hat{C}$. Intuitively, a disproportionate amount of local monitoring may be optimal because it is “cheaper,” as external investors have a lower shadow cost of getting remunerated through equity shares. Still, as long as the equilibrium level of $\hat{M}$ is sufficiently low (or $\gamma$ is sufficiently high), the above analysis suggests that the inventor equity share comoves with investor protections in the same manner as in our simpler model with only inventor monitoring.

For low enough values of $\gamma$, however, the above optimal contract leads to $\hat{M} > (p_H - p_L)\,\phi_E$, which violates constraint (vi). In such a case, we have $\lambda_6 > 0$. Manipulating the first-order conditions, one can show that $\hat{C}$ and $\hat{M}$ are implicitly defined by the system

\[
\frac{1 + (1-\gamma)\,\mu'(\hat{M})}{1 + (1-\gamma)\,\delta'(\hat{C})} = \beta,
\qquad
p_H - p_L - \hat{C} - \hat{M} = (1-\gamma)\left(\delta(\hat{C}) + \mu(\hat{M})\right).
\]

Unfortunately, without imposing particular functional forms for the functions $\delta(\cdot)$ and $\mu(\cdot)$, it becomes impossible to characterize how $\hat{C}$ (and thus $\hat{\phi}_I$) varies with $\gamma$.

HARVARD UNIVERSITY AND NBER
HARVARD BUSINESS SCHOOL AND NBER
HARVARD BUSINESS SCHOOL AND NBER
PRICE SETTING DURING LOW AND HIGH INFLATION: EVIDENCE FROM MEXICO∗

ETIENNE GAGNON

This paper provides new insight into the relationship between inflation and the setting of individual prices by examining a large data set of Mexican consumer prices covering episodes of both low and high inflation. When the annual rate of inflation is low (below 10%–15%), the frequency of price changes comoves weakly with inflation because movements in the frequency of price decreases and increases partly offset each other. In contrast, the average magnitude of price changes correlates strongly with inflation because it is sensitive to movements in the relative shares of price increases and decreases. When inflation rises beyond 10%–15%, few price decreases are observed and both the frequency and average magnitude are important determinants of inflation. I show that a menu-cost model with idiosyncratic technology shocks predicts the average frequency and magnitude of price changes well over a range of inflation similar to that experienced by Mexico.
I. INTRODUCTION

This paper presents new evidence on the setting of consumer prices during low and high inflation and sheds light on the empirical plausibility of competing models of price rigidities. It uses a new store-level data set containing over three million individual price quotes that are representative of more than half of Mexican consumers’ expenditures. The data start in January 1994 and end in June 2002. Over that nine-year period, the rate of increase in the official consumer price index (CPI) rose from 6.8% in 1994 to a peak of 41.8% in 1995, before falling to 4.9% in the last year of the sample.1 Given these considerable fluctuations, this data set allows me to document how individual consumer prices are set at various levels of inflation. It also can be used to discriminate among competing models of nominal price rigidities, as these models’ predictions diverge most in the presence of large shocks.

∗ I would like to thank the members of my dissertation committee, Lawrence J. Christiano, Alexander Monge-Naranjo, Sergio Rebelo, and especially my chairperson Martin Eichenbaum, for their continuous guidance and support. I am also grateful to Martin Bodenstein, Jeff Campbell, Reinout DeBock, Rodrigo García Verdú, Nicolas Vincent, and three anonymous referees for their insightful comments and suggestions. Chris Ahlin and José Antonio Murillo Garza offered valuable help with the data, and Martha Carillo, Matthew Denes, and Guthrie Dundas provided excellent research assistance. Financial support for this research was provided in part by the Northwestern University Center for International Economics and Development and the Fonds québécois pour les chercheurs et l’aide à la recherche (FCAR). The views expressed in this paper are solely the responsibility of the author and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System.

1. Unless otherwise indicated, all inflation figures are computed using the change in the logarithm of the price index and annualized.

© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009
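The convention in footnote 1 (inflation as the annualized change in the logarithm of the price index) can be illustrated with a minimal Python sketch. The function name, the `months` argument, and the index values are hypothetical and chosen only for illustration; they are not taken from the paper's data, and the sketch assumes simple linear annualization of the log change.

```python
import math


def annualized_log_inflation(p_start, p_end, months=1):
    """Annualized inflation in percent from the log change of a price index.

    p_start, p_end: index levels at the start and end of the window.
    months: length of the window in months (illustrative parameter).
    """
    if p_start <= 0 or p_end <= 0 or months <= 0:
        raise ValueError("index levels and window length must be positive")
    log_change = math.log(p_end) - math.log(p_start)
    # Scale the log change to a twelve-month rate and express it in percent.
    return 100.0 * (12.0 / months) * log_change


# Hypothetical index levels: a 1% rise over one month and a 10% rise over a year.
one_month = annualized_log_inflation(100.0, 101.0, months=1)
one_year = annualized_log_inflation(100.0, 110.0, months=12)
```

Under this convention a month-over-month figure is scaled by twelve, which is why monthly annualized rates can exceed the calendar-year rates reported for the same episode.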
[Figure I: line chart with vertical-axis ticks from 0 to 40 and horizontal axis running from 1988 to 2006. Legend: Austria, Belgium, Finland, France, Luxembourg, Portugal, Spain, United States, Mexico. Annotation: period covered by country studies.]

FIGURE I
Inflation and Time Coverage of U.S., Euro-Area, and Mexican CPI Studies
The studies shown are representative of at least 50% of consumer expenditures. Data on inflation come from the OECD Main Economic Indicators, Banco de México, and the U.S. Bureau of Labor Statistics. The sample period for the United States corresponds to the study of Nakamura and Steinsson (2008). Full references to the euro-area country studies can be found in Dhyne et al. (2005).
My data set captures considerably more variation in inflation than do other studies of consumer prices with comparable product coverage.2 As Figure I indicates, inflation was low and stable in the United States and the euro area relative to Mexico throughout the periods covered by the related studies. For high-inflation economies, the evidence is limited mainly to food products in Israel (Lach and Tsiddon 1992; Eden 2001; Baharad and Eden 2004) and Poland (Konieczny and Skrzypacz 2005) and to supermarket products in Argentina (Burstein, Eichenbaum, and Rebelo 2005). My paper differs from these studies because my data set is representative of a much larger set of goods and services in the CPI.

2. For studies on the United States, see Bils and Klenow (2004), Klenow and Kryvtsov (2008), and Nakamura and Steinsson (2008). Dhyne et al. (2005) review the main findings for the euro area.

The monthly frequency of price changes varied extensively over my sample period. It rose from an average of 22.1% in 1994
to a high of 61.9% at the peak of inflation in April 1995, before leveling off around 27.4% in the last year of the sample. I find some important differences in price-setting behaviors across low- and high-inflation periods. When inflation is low (below 10%–15%), the frequency of price changes is only mildly correlated with inflation, especially when I restrict the sample to goods, in which case the correlation almost entirely disappears. On the other hand, the average magnitude of price changes in such a low-inflation environment displays a tight and almost linear relationship with the level of inflation. As a result, movements in the frequency of price changes account for little of the inflation variance: at most 11% for the full sample and 6% for the subsample of goods, figures that are similar to that of Klenow and Kryvtsov (2008) for the United States (about 5%). By contrast, when inflation is high (above 10%–15%), both the frequency and average magnitude of price changes are strongly correlated with inflation. Movements in the frequency of price changes then comprise an important component of inflation variance. When I decompose price changes between price increases and decreases, I find that the frequency of price increases rises steadily as inflation rises from 0% to 10%–15%. This rise is partly offset by a simultaneous decline in the frequency of price decreases, thereby dampening movements in the overall frequency of price changes. This offsetting effect stems from goods, which have the largest proportion of price decreases. By comparison, relatively few price decreases are observed among services. As inflation rises from a low level, the decline in the occurrence of price decreases relative to price increases exacerbates movements in the average magnitude of price changes. 
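The offsetting movements described above can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: quotes are unweighted, whereas the statistics defined later in the paper may use expenditure weights, and the function name, dictionary keys, and prices are hypothetical. It splits the frequency of price changes into increases and decreases and checks the accounting identity that average inflation equals the frequency of changes times the average magnitude of nonzero changes.

```python
import math


def price_change_stats(price_pairs):
    """Summarize one month of (previous_price, current_price) quotes.

    Returns the share of quotes that changed, the shares that rose and fell,
    the mean log change among quotes that changed, and the mean log change
    across all quotes (unweighted inflation for this sample).
    """
    if not price_pairs:
        raise ValueError("need at least one price pair")
    changes = [math.log(cur) - math.log(prev) for prev, cur in price_pairs]
    n = len(changes)
    ups = [c for c in changes if c > 0]
    downs = [c for c in changes if c < 0]
    n_changed = len(ups) + len(downs)
    avg_magnitude = (sum(ups) + sum(downs)) / n_changed if n_changed else 0.0
    return {
        "freq": n_changed / n,
        "freq_up": len(ups) / n,
        "freq_down": len(downs) / n,
        "avg_magnitude": avg_magnitude,
        "inflation": sum(changes) / n,
    }


# Hypothetical quotes: two unchanged prices, one increase, one decrease.
pairs = [(10.0, 10.0), (10.0, 11.0), (20.0, 19.0), (5.0, 5.0)]
stats = price_change_stats(pairs)
```

In this toy sample the overall frequency is one half, split evenly between increases and decreases. If the decrease became an increase, the overall frequency would stay the same while the average magnitude (and hence inflation) would rise, which mirrors the composition effect discussed in the text.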
In my data set, the change in the composition of price changes largely explains the strong correlation between inflation and the average magnitude of price changes when inflation is low. Once inflation moves beyond 10%–15%, price decreases have largely disappeared from most sectors of the economy, with the exception of some fresh produce. The frequency of price increases continues to rise steadily with inflation, however, and the frequency of price changes thus becomes highly correlated with inflation. Overall, my empirical results suggest that pricing models should endogenize the timing of price changes if they wish to make realistic predictions at both low and high inflation levels. They also present the challenge of finding a model offering empirically plausible predictions at all levels of inflation. To investigate
whether menu-cost models are consistent with my findings, I calibrate a discrete-time version of the Golosov and Lucas (2007) model. The model features idiosyncratic technology shocks giving rise to a distribution of both positive and negative nominal price adjustments. I show that the model performs well in terms of predicting the average frequency and magnitude of price changes for levels of inflation similar to the ones experienced by Mexico over my sample period. The success of the model comes in part from the presence of offsetting movements in the frequency of price increases and decreases, and highlights the importance of idiosyncratic shocks in this class of models for delivering empirically plausible predictions.

The paper is organized as follows. In the next section, I provide a brief overview of the Mexican macroeconomic context over the sample period. In Section III, I describe the assemblage of my data set and discuss features of the data that are important for interpreting my results. Section IV defines the statistics computed in this paper. The main empirical findings are presented in Section V and are then compared to other studies of high-inflation environments in Section VI. In Section VII, I calibrate a discrete-time menu-cost model with idiosyncratic technology shocks and investigate its consistency with some key empirical features reported in the paper. The last section provides concluding remarks.

II. MACROECONOMIC CONTEXT

The sample period was marked by a severe economic downturn in the wake of the December 1994 peso devaluation. To most observers of the Mexican economy, however, 1994 opened rather positively.3 Inflation had been stabilized successfully below 10%, a major achievement in light of the three-digit rates of the late 1980s, and real interest rates also had decreased. The excess return on the three-month, dollar-denominated Tesobonos was only two percentage points above the American T-bill.
The budget deficit, seen by many as the culprit of previous economic crises, had been eliminated in 1992. Moreover, the North American Free Trade Agreement had taken effect on January 1, 1994. Foreign capital entered abundantly with a net inflow over 8% of GDP in 1993. However, growth in real GDP per capita remained modest, averaging 2.5% from 1991 to 1993. Many observers saw this situation as part of a restructuring process that soon would bring strong growth to the country.

The devaluation brought a radical change of mood. On December 22, 1994, the exchange rate collapsed and lost more than 40% of its value vis-à-vis the U.S. dollar in the week that followed.4 As depicted in Figure II, short-term interest rates were pushed upward substantially as Banco de México tightened the supply of money to prevent further erosion of the peso and capital flight.

3. See Edwards (1998) for a review of observers’ opinions in 1994.

4. Mexico pegged its exchange rate to the dollar in May 1992. In February 1994, the country switched to pre-announced crawling bands around the U.S. dollar.

[Figure II: time-series panels covering 1994 to 2002. Panel titles recoverable from the source: (b) Inflation rate; (d) Money aggregates (logs, 1994M1=0), series M1 and M4; (e) Real output (logs, 1994Q1=0); (f) Real consumption (logs, 1994Q1=0). Vertical axes are in percent.]

FIGURE II
Main Macroeconomic Indicators
Source: Banco de México.
The devaluation left major stagflation in its wake. Inflation took off almost immediately, increasing from 6.4% in November 1994 to 44.3% in January 1995 before peaking at 92.0% in April 1995. Real output per capita contracted 9.5% in 1995, whereas private consumption per capita fell a solid 13.2%. Mexicans would have to wait until 1998 for real GDP per capita to surpass its 1994 level and until 1999 for inflation to settle below 10%. The decline in aggregate income, coupled with a rise in fiscal evasion, brought a sharp decline in government revenues.5 To prevent further revenue erosion, the government raised the general rate of the value-added tax from 10% to 15% on April 1, 1995. This change affected all Mexican regions, with the notable exceptions of Baja California and a corridor along the country’s southern and northern borders where the rate remained at 10%.
III. MEXICAN MICRO DATA ON CONSUMER PRICES
III.A. Description of Sources
The data comprise price quotes collected by Banco de México for computing the Mexican CPI. Most price quotes correspond to narrowly defined items sold in specific outlets (e.g., corn flour, brand Maseca, bag of 1 kg, sold in outlet 1100 in Mexico City). A limited number of quotes are citywide indices, or the average prices of small samples of narrowly defined items belonging to the same category and outlet. Since January 1994, the official gazette of the Mexican government, the Diario Oficial de la Federación, has published price quotes every month. This publication releases each quote with a key linking the item to a specific outlet, city, and product category; these keys allow me to track individual prices over time.6 In this paper, I refer to an item’s complete price history as its price trajectory. The raw data set contains a total of 4.7 million price quotes from January 1994 to June 2002.
Banco de México is required to make individual prices available to the public up to six months after their publication, but it does not keep a historical data set of individual prices. The data set was assembled by merging the information released in the Diario. The data for the months of January 1994 to February 1995 could not be extracted electronically,
5. See OECD (2000) for a description of the taxation system.
6. Items from the same outlet are attributed store keys independently to ensure confidentiality.
so they were typed in from original hard copies of the Diario using double-entry keying, a process ensuring a characterwise accuracy in excess of 99.998%.7 About 430,000 price quotes were added to the database in this way. Precise item descriptions were published in March 1995. The Diario also includes lists of items that are periodically added to, dropped from, or substituted in the CPI basket. Unlike additions, substitutions are not planned events. They occur when the characteristics of an item (weight, size, model, presentation, etc.) change, when an outlet stops carrying an item, or, in rarer cases, when an outlet goes out of business. The weights used in the CPI are derived from the Survey of Households’ Income and Expenditures (ENIGH). The CPI product categories are representative of all ENIGH categories accounting for at least 0.02% of households’ expenditures. This ensures a coverage of well above 95% of Mexican households’ expenditures. To facilitate comparisons with other studies, I classify each product category according to the euro-area classification of individual consumption by purpose (COICOP).
III.B. Sample Coverage
In January 1994, the CPI contained 30,692 price quotes spread over 302 product categories. By June 2002, the last month in my sample, it had expanded to nearly 50,000 price quotes distributed over 313 product categories. A major revision of the basket occurred in March 1995 when the number of cities covered in the CPI grew from 35 to 46. At the same time, 29 new product categories were introduced into the basket, and 18 were abandoned. This revision had been planned long before the peso’s devaluation. In July 2002, Banco de México updated the basket again to reflect the structure of Mexican households’ consumption in 2000. I cannot link items before and after the 2002 basket revision because of a change to the item keys.
To ensure the greatest comparability across time, I compute my results for a sample covering January 1994 to June 2002 using the expenditure weights implemented in March 1995.8 The sample is further restricted to the product categories comprising individual prices that were unaffected by the 1995 basket revision and I consider only items whose price was 7. I thank Chris Ahlin for lending me original copies of the Diario. 8. These weights are derived from the 1989 ENIGH survey. They were updated using relative prices to reflect consumer expenditures in 1993.
TABLE I
MAIN SAMPLE STATISTICS

Period                            January 1994–June 2002
Price quotes
  Total                           3,209,947
  Average per month               31,470
Trajectories                      44,272
Substitutions                     10,457
Product categories                227
CPI coverage (%)                  54.1
Sample composition (%)
  Unprocessed food                26.4
  Processed food                  21.7
  Energy                          0.4
  Nonenergy industrial goods      26.4
  Services                        25.1
not regulated. In addition, most education services and clothing items were dropped for reasons detailed below. The final sample contains 3.2 million price quotes from over 44,000 price trajectories and covers 54.1% of CPI expenditures. The main groups of products excluded are rents and homeowners’ imputed rents, clothing (except for a few product categories containing individual observations), and education services, whose weights in the CPI are, respectively, 14.0%, 6.0%, and 3.5%. Food items represent just under half of the expenditures in the final sample, a proportion higher than in most U.S. and euro-area studies. Summary statistics are presented in Table I.
III.C. Other Aspects of the Data
I now address features of the data that are important to consider in interpreting the results. The most significant issue is price averaging. Banco de México collects prices twice monthly for all items but food; food price collection occurs four times per month.9 The collected prices are then averaged to produce the monthly figures reported in the Diario. Observing the monthly average rather than the actual price of an item complicates the inference about price changes. For example, an average price of $2 for an item is
9. In the United States, the BLS collects prices monthly for food consumed at home, energy, and a few additional items with volatile prices. Other prices are collected monthly for the three largest metropolitan areas (New York, Los Angeles, and Chicago) and every other month for the remaining areas.
consistent with an actual price of $2 throughout the month. It also is consistent with an actual price of $1.50 in the first half of the month and $2.50 in the second, or any combination of prices with $2 as their average. Moreover, changes to an average-price series are typically more frequent and smaller on average than changes to an actual-price series with the same publication frequency. For example, a price hike from $1.50 to $2.50 in the middle of the month results in an average price of $2, which is $0.50 short of the new actual price, so that another change to the average-price series will likely be recorded in the next month. To make my results as comparable as possible to other studies, which typically do not use averaged price quotes, I have constructed alternative price trajectories that filter out the effect of averaging observations whenever possible. These new series correspond to the end-of-month series that are both consistent with the published average prices and minimize the number of price changes. Appendix I provides an extensive discussion of how averaging observations affects inferences about the timing and magnitude of price changes, and of how the filter was implemented. I was provided with unpublished semimonthly data by Banco de México, which allows me to directly assess the performance of the filter. Overall, the filtered series are much closer to the end-of-period price series that they aim to reproduce. More importantly, the filtered series capture the timing of price changes with great accuracy. All the main patterns described in this paper are found whether prices are filtered or not. Another data issue is that price collectors do not always directly observe prices. Sometimes an item is out of stock, out of season or, in rarer cases, the outlet is closed when the CPI agent visits. In such situations, the price from the previous period is carried forward.
Although I cannot identify prices that were imputed in my sample, I do find clear indications that the number of imputations was larger at the beginning of the sample. Item substitutions represented less than 0.1% of all published price quotes in 1994, a proportion that rose to 1.2% in 2001. A more systematic treatment of substitutions was implemented in 2001. Prices can now be carried forward for at most a month and a half before a substitution is sought. If the scarcity is generalized, this allowance can be extended up to three months. This methodological change likely creates a slight downward bias in the estimated frequency of price changes at the beginning of the sample.
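The effect of price averaging described earlier in this subsection can be illustrated with a minimal sketch. It is not the filter of Appendix I; the function names and the semimonthly prices are hypothetical, chosen only to reproduce the $1.50 to $2.50 example from the text.

```python
def monthly_average(semimonthly):
    """Average consecutive pairs of semimonthly prices into monthly quotes."""
    return [(semimonthly[k] + semimonthly[k + 1]) / 2
            for k in range(0, len(semimonthly), 2)]


def count_changes(series):
    """Count the months in which the quote differs from the previous month."""
    return sum(1 for prev, cur in zip(series, series[1:]) if cur != prev)


# Hypothetical item: the price is $1.50 and rises to $2.50 in the middle of month 2.
semimonthly = [1.50, 1.50, 1.50, 2.50, 2.50, 2.50]

end_of_month = semimonthly[1::2]         # actual price at the end of each month
averaged = monthly_average(semimonthly)  # monthly average, as published

print(end_of_month, count_changes(end_of_month))
print(averaged, count_changes(averaged))
```

In this example the end-of-month series records a single change of $1.00, whereas the averaged series records two changes of $0.50 each, which is the sense in which averaging makes changes look more frequent and smaller.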
Prices are inclusive of sales as long as they are conditional on the purchase of a single item. For example, in a three-for-two promotion, the regular price would be reported. In the United States, the Bureau of Labor Statistics reports prices net of sales and promotions whenever possible; the same three-for-two promotion would result in a temporary 33% price decrease. There is no variable in the Mexican data set signaling that an item is on sale or that a promotion is going on. Most price quotes for the product categories of textiles, clothing, shoes, and related accessories are averages of small samples of item prices; all items within a sample pertain to the same outlet whenever possible. Banco de México uses store samples to alleviate the problems associated with rapidly appearing and disappearing items due to changes in fashion and the seasons. All such samples were dropped from my analysis to limit the discussion to individual price changes. The decision to include or exclude store samples has little impact on the main findings. All education services observations, which cover registration, activity, and tuition fees, were also dropped from the sample. These services are typically not available for purchase or not sampled during most months of the year. Prices are mechanically carried forward until the start of the next registration period, semester, or academic year. For this reason, one cannot directly interpret the absence of price changes in the monthly series as evidence of price stickiness. A final issue is that item substitutions often accompany changes in product characteristics, thereby raising the question of whether substitutions should be treated as price changes. The Inflation Persistence Network’s approach is to assume that all item substitutions not previously planned by CPI agencies involve a price change, a choice guided in part by the absence of substitution flags in some of the national databases.
In this paper, substitutions were instead excluded from the computation of price changes because their treatment varied over the sample period. The main patterns found in this paper are not affected by this choice.
IV. INFLATION ACCOUNTING PRINCIPLES
Whenever a price is reported for two consecutive months, I create an indicator that a price change has occurred,
\[
I_{it} =
\begin{cases}
1 & \text{if } p_{it} \neq p_{it-1} \\
0 & \text{if } p_{it} = p_{it-1},
\end{cases}
\]
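where $p_{it}$ is the price of item $i$ (in logs) during month $t$. Inflation is defined as $\pi_t = \sum_{i \in \Upsilon_t} \omega_{it} \Delta p_{it}$, where $\Delta p_{it} = p_{it} - p_{it-1}$, $\omega_{it}$ is the sample weight of item $i$, and $\Upsilon_t$ is the set of all items in the sample for which $I_{it}$ is defined. For $\omega_{it}$, I use the sample share of spending on the product category to which item $i$ belongs, divided by the number of items in that product category for which I can compute a price change at $t$. Inflation can also be expressed as
\[
\pi_t = \underbrace{\sum_{i \in \Upsilon_t} \omega_{it} I_{it}}_{fr_t} \cdot \underbrace{\frac{\sum_{i \in \Upsilon_t} \omega_{it} \Delta p_{it}}{\sum_{i \in \Upsilon_t} \omega_{it} I_{it}}}_{dp_t}.
\]
The term $fr_t$, henceforth referred to as the frequency of price changes, is the share of spending in the sample on items whose price changed at month $t$. The term $dp_t$ is the average magnitude of those price changes. In the popular Taylor (1980) and Calvo (1983) models with uniform staggering of price changes, $dp_t$ is the only possible source of variation in $\pi_t$. It is convenient to decompose inflation further into a weighted sum of price increases and decreases:
\[
\pi_t = \underbrace{\sum_{i \in \Upsilon_t} \omega_{it} I_{it}^{+}}_{fr_t^{+}} \cdot \underbrace{\frac{\sum_{i \in \Upsilon_t} \omega_{it} I_{it}^{+} \Delta p_{it}}{\sum_{i \in \Upsilon_t} \omega_{it} I_{it}^{+}}}_{dp_t^{+}} + \underbrace{\sum_{i \in \Upsilon_t} \omega_{it} I_{it}^{-}}_{fr_t^{-}} \cdot \underbrace{\frac{\sum_{i \in \Upsilon_t} \omega_{it} I_{it}^{-} \Delta p_{it}}{\sum_{i \in \Upsilon_t} \omega_{it} I_{it}^{-}}}_{dp_t^{-}}.
\]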
This decomposition is informative about the relationship between inflation and the distribution of price changes. The computation of inflation statistics for special aggregates, such as goods and services, also follows the approach outlined above. My methodology for computing inflation is similar to the approach taken in most euro-area and U.S. studies of individual price changes but differs from that of Banco de México at the time, which computed inflation as the percentage change in a Laspeyres index. Despite differences in sample coverage, methodology, and filtering of price trajectories, the inflation rate in my sample is strongly correlated with the change in the official CPI:
The coefficient of correlation is 0.96 over the full sample period and 0.85 over the last three years of the sample.
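The accounting identities of Section IV can be sketched in a few lines of code. This is an illustration only: the function name, the dictionary keys, and the four-item example are hypothetical, and the weights are simply assumed to sum to one rather than being built from category spending shares as in the paper.

```python
def decompose_inflation(prev, cur, weights):
    """Split weighted inflation into frequency and average size of price changes.

    prev, cur: log prices in months t-1 and t for the items in the sample.
    weights:   sample weights (assumed here to sum to one).
    """
    dlogp = [c - p for p, c in zip(prev, cur)]
    stats = {"pi": sum(w * d for w, d in zip(weights, dlogp))}
    selectors = {
        "": lambda d: d != 0,   # any price change
        "+": lambda d: d > 0,   # price increases
        "-": lambda d: d < 0,   # price decreases
    }
    for tag, keep in selectors.items():
        fr = sum(w for w, d in zip(weights, dlogp) if keep(d))
        change = sum(w * d for w, d in zip(weights, dlogp) if keep(d))
        stats["fr" + tag] = fr
        stats["dp" + tag] = change / fr if fr > 0 else 0.0
    return stats


# Hypothetical month: four equally weighted items, one increase, one decrease.
prev = [0.00, 0.00, 0.00, 0.00]
cur = [0.10, 0.00, -0.04, 0.00]
weights = [0.25, 0.25, 0.25, 0.25]

result = decompose_inflation(prev, cur, weights)
print(result)
```

In this example half of the weighted items change price, and inflation equals both the frequency times the average change and the sum of the increase and decrease components, up to floating-point rounding.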
V. MAIN EMPIRICAL RESULTS This section presents the key empirical findings, focusing on the relationship between inflation and the frequency and average magnitude of price changes. I treat (nonregulated) goods and services separately throughout the discussion due to differences in the way prices are set between the two groups.10 I place a special emphasis on the results for goods, given their predominance in my sample and their greater representativeness. V.A. Setting of Consumer Goods Prices My subsample of goods accounts for 74.9% of all expenditures in my basket and is representative of 77.5% of Mexican consumer expenditures on goods (excluding energy). Most goods left out of the sample pertain to product categories falling under the apparel and related accessories group. Frequency of Goods Price Changes. As seen in the upper panel of Figure III, movements in the frequency of price changes and inflation were very large over the sample period. In April 1995, the rate of inflation in my sample of goods peaked at 86.0% (7.2% in monthly terms). This rate is much higher than the average in 1994 (7.5%) or during the last year of the sample (1.5%). The frequency of price changes also peaked in April 1995, when the price of 64.7% of goods, measured in CPI weights, changed during the month. This number is more than twice the average frequency of 26.8% in 1994. There were large variations in the composition of price changes over the sample period, as shown in the lower panel of Figure III. At the peak of inflation, only 8.9% of price changes were negative, a proportion that rose to 46.0% in the last year of the sample. The corresponding proportion for the full sample of goods and services over the last year of the sample is 43.4%, a figure echoing those from U.S. and euro-area studies. 10. For the products in my sample, the COICOP goods/services classification is almost identical to the Bank of Mexico’s tradables/nontradables classification. 
The results reported in the paper for goods and services thus have an alternative interpretation in terms of tradables and nontradables.
[Figure III. Panels: (a) Frequency of price changes and inflation (series: Frequency, Inflation); (b) Frequency of price increases and decreases (series: Increases, Decreases). Vertical axes in percent; horizontal axes 1994 to 2002.]
FIGURE III Monthly Frequency of Price Changes (Nonregulated Goods) All statistics in the figure, including inflation, are computed using the sample of nonregulated goods.
Positive comovement between $fr_t$ and $\pi_t$ is clearly visible in Figure III. The correlation coefficient between the two series is 0.91 for the whole period.11 This correlation is largely driven by the high-inflation episode, however; it is about zero if I consider only the last three years of the sample. After mid-1996, it is difficult to spot any downward drift in the frequency of price changes, even though inflation trends down. The reason behind this loose relationship is apparent in the lower panel of Figure III, where I break down $fr_t$ into $fr_t^{+}$ and $fr_t^{-}$. As inflation declined, so did the frequency of price increases. At the same time, price decreases became more frequent, thereby dampening movements in the overall frequency of price changes. A look at the correlation between
11. All correlation statistics presented in this section are computed using linearly detrended series.
$fr_t^{+}$, $fr_t^{-}$, and $\pi_t$ provides further evidence of these offsetting movements. In the last three years of the sample, the correlation is 0.59 between $fr_t^{+}$ and $\pi_t$, and −0.74 between $fr_t^{-}$ and $\pi_t$. The net result is a relative absence of correlation between $fr_t$ and $\pi_t$ for my sample of goods over that period. There are a few apparent large negative movements in the inflation series of goods over the low-inflation period, in particular in March 1999, February 2001, July 2001, and February 2002, which are associated with unusually large changes in fresh produce prices. Shocks to the supply of fruits and vegetables, such as unusual weather conditions, can have a notable impact on the price of these items because they are perishable in nature. Some evidence of opposite movements in the frequency of price increases and decreases is apparent for these months. The scatterplot in the upper left panel of Figure IV offers a view from a different angle of the relationship between the monthly frequency of price changes and inflation. Similar scatterplots for price increases and decreases are shown in the middle left and lower left panels, respectively. All panels display linear regression lines that use linear, quadratic, and cubic goods inflation terms as explanatory variables, as well as a full set of year dummies. The dummies are included to account for potential shifts in the relationships over time that are unrelated to inflation, such as fluctuation in aggregate demand, basket composition, and methodology. I present regression lines for two sets of observations. The dashed lines include all monthly observations in the sample. The solid lines exclude April 1995, which was marked by a 5-percentage-point increase in the value-added tax, as well as all periods with negative inflation, which effectively removes all large shocks to fresh produce mentioned earlier.
Variations in the supply of fresh fruits and vegetables and value-added tax changes are shocks that differ in nature from a general rise in the price level. For this reason, my discussion of the scatterplots focuses on the regression results for the smaller sample, as they likely capture the overall relationship between inflation and its components better. All regression statistics can be found in Table II. When inflation is zero, each percentage-point increase in the rate of nonregulated-goods inflation is associated with a 0.35 (0.13)-percentage-point rise in the frequency of price increases and an opposite 0.22 (0.06)-percentage-point decline in the frequency
[Figure IV. Panels: (a) Frequency of price changes; (b) Magnitude of price changes; (c) Frequency of price increases; (d) Magnitude of price increases; (e) Frequency of price decreases; (f) Magnitude of price decreases. Horizontal axes: Inflation (%); vertical axes: Frequency (%) or Magnitude (%). Legend: Data; All observations; Excluding π<0 and VAT change.]
FIGURE IV Scatterplot of the Monthly Frequency and Average Magnitude of Price Changes and Inflation (Nonregulated Goods) Each panel contains a scatter plot of the annualized monthly inflation rate, on the x-axis, and the associated monthly frequency or average magnitude statistics, on the y-axis. All statistics were computed using all nonregulated goods in the sample. The frequency and average magnitude were regressed on linear, quadratic, and cubic inflation terms, as well as a full set of year dummies. The dashed lines show the relationships predicted using all monthly observations in the regressions, conditional on the mean year dummy, and the solid lines show the same relationships when observations associated with negative monthly inflation outcomes and the April 1995 value-added tax change are excluded.
TABLE II
LINEAR REGRESSION RESULTS (NONREGULATED GOODS)

Frequency of price changes
               fr                                 fr+                                fr−
               All              Restricted        All              Restricted        All              Restricted
Constant       0.252 (0.005)    0.250 (0.008)     0.147 (0.004)    0.140 (0.007)     0.106 (0.002)    0.110 (0.004)
π              0.128 (0.040)    0.136 (0.126)     0.265 (0.047)    0.354 (0.132)     −0.137 (0.020)   −0.218 (0.061)
π²             0.983 (0.162)    1.293 (0.546)     0.888 (0.164)    0.943 (0.568)     0.095 (0.066)    0.350 (0.256)
π³             −0.739 (0.163)   −1.310 (0.589)    −0.740 (0.162)   −1.083 (0.616)    0.001 (0.059)    −0.227 (0.274)
Year dummies
  1995         0.012 (0.010)    0.007 (0.011)     0.013 (0.010)    0.005 (0.010)     0.000 (0.006)    0.002 (0.006)
  1996         0.050 (0.009)    0.041 (0.009)     0.049 (0.008)    0.038 (0.008)     0.000 (0.003)    0.003 (0.004)
  1997         0.050 (0.007)    0.047 (0.007)     0.044 (0.006)    0.041 (0.006)     0.005 (0.003)    0.007 (0.003)
  1998         0.056 (0.010)    0.050 (0.009)     0.051 (0.008)    0.043 (0.008)     0.005 (0.003)    0.007 (0.004)
  1999         0.047 (0.007)    0.046 (0.007)     0.039 (0.008)    0.035 (0.007)     0.009 (0.004)    0.011 (0.003)
  2000         0.054 (0.008)    0.054 (0.008)     0.028 (0.005)    0.030 (0.005)     0.026 (0.005)    0.025 (0.005)
  2001         0.076 (0.006)    0.078 (0.007)     0.033 (0.006)    0.036 (0.007)     0.043 (0.003)    0.041 (0.003)
  2002         0.085 (0.006)    0.082 (0.005)     0.029 (0.005)    0.027 (0.004)     0.056 (0.003)    0.055 (0.004)
R²             .92              .90               .95              .95               .92              .90

Average magnitude of price changes
               dp                                 dp+                                dp−
               All              Restricted        All              Restricted        All              Restricted
Constant       0.006 (0.001)    0.003 (0.001)     0.082 (0.002)    0.079 (0.003)     0.106 (0.004)    0.098 (0.005)
π              0.242 (0.010)    0.301 (0.015)     0.075 (0.011)    0.127 (0.048)     −0.226 (0.047)   −0.081 (0.094)
π²             −0.200 (0.034)   −0.419 (0.078)    0.011 (0.035)    −0.196 (0.185)    0.645 (0.116)    0.195 (0.361)
π³             0.065 (0.030)    0.280 (0.088)     −0.028 (0.033)   0.189 (0.195)     −0.462 (0.086)   −0.063 (0.398)
Year dummies
  1995         0.003 (0.002)    0.003 (0.002)     −0.006 (0.004)   −0.007 (0.004)    −0.014 (0.006)   −0.019 (0.005)
  1996         −0.001 (0.001)   −0.002 (0.001)    −0.010 (0.003)   −0.011 (0.003)    0.013 (0.006)    0.008 (0.005)
  1997         −0.002 (0.001)   −0.003 (0.000)    −0.011 (0.003)   −0.012 (0.003)    0.004 (0.003)    0.001 (0.003)
  1998         −0.002 (0.001)   −0.003 (0.001)    −0.008 (0.003)   −0.009 (0.003)    0.015 (0.005)    0.011 (0.005)
  1999         −0.004 (0.001)   −0.003 (0.000)    −0.008 (0.002)   −0.008 (0.002)    0.010 (0.009)    0.005 (0.008)
  2000         −0.004 (0.001)   −0.003 (0.000)    0.001 (0.003)    0.001 (0.003)     0.002 (0.003)    0.004 (0.003)
  2001         −0.005 (0.001)   −0.004 (0.000)    0.011 (0.003)    0.011 (0.003)     0.006 (0.003)    0.010 (0.003)
  2002         −0.004 (0.001)   −0.004 (0.000)    0.016 (0.002)    0.018 (0.002)     −0.001 (0.004)   0.003 (0.004)
R²             .99              .99               .79              .77               .48              .31

Notes: The numbers in parentheses are standard errors based on the Huber–White estimator of variance. The restricted sample (labeled “restricted”) excludes monthly observations associated with negative inflation outcomes and the value-added tax change of April 1995. All inflation statistics displayed are computed using the sample of nonregulated goods.
of price decreases.12 These opposite movements have dampening effects on the frequency of price changes, whose corresponding slope is 0.14 (0.13). As inflation increases from a low level, the frequency of price increases becomes more responsive to changes in inflation, whereas the frequency of price decreases becomes less so, resulting in greater sensitivity of the frequency of price changes to inflation. At an inflation rate of 15%, a 1-percentage-point change in inflation is associated with a 0.56 (0.04)-percentage-point rise in the frequency of price increases and a 0.13 (0.01)-percentage-point decline in the frequency of price decreases. As inflation increases further, few price decreases are observed in the economy; the rise in the frequency of price changes is then mainly driven by the steady growth in the occurrence of price increases. At all levels of inflation, I find that the response of the frequency of price increases to a change in inflation is larger than that of price decreases. A similar asymmetry is found in U.S. data, as reported by Nakamura and Steinsson (2008). The year dummies appear to capture some key changes in methodology and the economic environment over time. In particular, they are lowest at the beginning of the sample, when maintaining a fixed basket was seen as important, and highest for 2001 and 2002, which had systematic substitutions of unavailable items to keep the basket up to date. No major change in methodology occurs over the 1996 to 2000 period, and I cannot reject the hypothesis that the year dummies for 1996 to 2000 are jointly identical at the 10% significance level. Imposing such equalities results in slightly more sensitive responses of the frequency of price increases and decreases at low levels of inflation, but the overall sensitivity of the frequency of goods price changes to inflation is largely unchanged. Interestingly, the year dummies for price increases and decreases have a tendency to rise over time.
It is thus possible that factors not directly related to inflation, such as innovations in the technology used by outlets to change prices or changes in the composition of stores, moderated the fall in the frequency of price changes as inflation declined in the latter years of the sample.
12. Standard errors are shown in parentheses. They were computed using the Huber–White estimator of variance. As a check, I also computed standard errors using the autocorrelation-robust Newey–West estimator for the entire sample period (consecutive observations are required), with negligible impact on the estimates. Moreover, the fit of the linear model is virtually identical to that obtained using the nonlinear estimator of Papke and Wooldridge (2006), which directly accounts for the zero–one bounds on the frequency.
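The slopes quoted in this subsection can be recovered from the cubic specification by differentiating the fitted polynomial with respect to inflation. The sketch below is a check under stated assumptions, not the author's code: the coefficients are the restricted-sample point estimates on π, π², and π³ as read from Table II, inflation is assumed to enter as an annualized rate in decimal form (0.15 for 15%), and the function and variable names are hypothetical. Standard errors of the slopes are not reproduced because they require the coefficient covariance matrix.

```python
def slope(coefs, pi):
    """Derivative of b1*pi + b2*pi**2 + b3*pi**3 with respect to pi."""
    b1, b2, b3 = coefs
    return b1 + 2 * b2 * pi + 3 * b3 * pi ** 2


# Restricted-sample coefficients on (pi, pi^2, pi^3), as read from Table II.
FR_UP = (0.354, 0.943, -1.083)     # frequency of price increases
FR_DOWN = (-0.218, 0.350, -0.227)  # frequency of price decreases

for pi in (0.0, 0.15):
    print(pi, round(slope(FR_UP, pi), 2), round(slope(FR_DOWN, pi), 2))
```

Under these assumptions the computed slopes are 0.35 and −0.22 at zero inflation and 0.56 and −0.13 at 15% inflation, matching the magnitudes reported in the text.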
Average Magnitude of Goods Price Changes. The average magnitude of goods price changes comoves strongly with goods inflation, regardless of whether the latter is low or high. As shown in the upper panel of Figure V, dpt and πt follow similar patterns over the sample period.13 Both series registered sharp increases during the Mexican peso crisis, followed by a protracted decline and ultimately a stabilization. The correlation between the two series is 0.95 over the full sample period. The high-inflation episode does not drive this strong correlation, as was the case with the frequency of price changes; indeed, the correlation actually rises over the last three years of the sample. As the upper right panel of Figure IV indicates, dpt and πt have a tight, almost linear relationship when inflation is low. When inflation is elevated, this relationship is still strong and positive, although a bit noisier and somewhat concave. The figure also displays linear regression lines computed using the same set of observations and regressors employed for the frequency of price changes. The corresponding regression statistics are presented in Table II. The average sizes of price increases and decreases are much less sensitive to the level of inflation than dpt . Except for a short period around the peak of inflation, the two series show relatively small oscillations around their respective sample means of 9.0% for price increases and 9.8% for price decreases.14 In the case of price decreases, I cannot reject the hypothesis that the coefficients associated with the three inflation terms in the regression are jointly equal to zero. The middle right panel is consistent with a mild rise in the size of price increases as inflation moves from a low to a high level. One cannot exclude, however, the possibility that this positive relationship partly reflects a rise in the occurrence of multiple price increases during the month. 
When this is the case, dpt+ overstates the size of individual price increases. The finding of a tight relationship between the average magnitude of price changes and inflation should come as no surprise, given the behavior of the frequency of price changes documented earlier. By definition, πt = frt · dpt . When inflation is low, frt moves little with inflation, implying that dpt moves strongly and almost linearly with πt . By contrast, when inflation is high, frt comoves strongly and positively with πt . This second source of variation in 13. The inflation series displayed is the nonannualized monthly inflation rate to facilitate visual comparisons. 14. The few large spikes in dpt− correspond to large variations in the price of some fresh produce.
[Figure V. Panels: (a) Magnitude of price changes vs. inflation (series: Average change, Monthly inflation); (b) Average magnitude of increases and decreases (series: Increases, Decreases); (c) Predicted average change (series: Actual, Fixed magnitude, Fixed share). Vertical axes in percent; horizontal axes 1994 to 2002.]
FIGURE V Average Magnitude of Price Changes (Nonregulated Goods) All statistics in the figure, including inflation, are computed using the sample of nonregulated goods. The monthly rate of inflation is not annualized.
$\pi_t$ introduces some curvature in the relationship between $\pi_t$ and $dp_t$. To better understand what drives $dp_t$, it is convenient to express it as
\[
dp_t = s_t \cdot |dp_t^{+}| - (1 - s_t) \cdot |dp_t^{-}|,
\]
where $s_t = fr_t^{+}/(fr_t^{+} + fr_t^{-})$ is the share of price increases among price changes. As this equation makes clear, fluctuations in $dp_t$ can originate from two sources: changes in the relative occurrence of price increases and decreases (the composition effect) and variations in their respective sizes. To assess the importance of each margin, I compute two counterfactual series, which are displayed in the lower panel of Figure V. I obtain the first by holding $s_t$ at its sample mean to show how movements in $|dp_t^{+}|$ and $|dp_t^{-}|$ alone affect $dp_t$. In the second series, $|dp_t^{+}|$ and $|dp_t^{-}|$ are held at their respective sample means so that the relative occurrence of price increases and decreases is the only source of variation in $dp_t$. The exercise indicates that the composition effect drives $dp_t$ almost entirely when inflation is below 10%–15%. Had $s_t$ been constant, $dp_t$ would have shown a counterfactual gentle rise in the last three years of the sample because of a mild upward trend in $dp_t^{+}$ after 1999. By contrast, the series allowing only for the composition effect predicts the level of $dp_t$ remarkably well over that period. When inflation nears its peak, the composition effect alone is insufficient to match the level of $dp_t$ closely, but it is a better predictor than merely allowing for changes in the average absolute magnitude.
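The two counterfactual series can be sketched as follows. This is a minimal illustration of the exercise as described in the text, assuming only the identity for the average price change; the function names and the three-month example data are hypothetical.

```python
def average_change(share_up, size_up, size_down):
    """dp = s * |dp+| - (1 - s) * |dp-|."""
    return share_up * size_up - (1 - share_up) * size_down


def mean(values):
    return sum(values) / len(values)


def counterfactuals(shares, sizes_up, sizes_down):
    """Return the (actual, fixed-share, fixed-magnitude) series of dp."""
    s_bar, up_bar, down_bar = mean(shares), mean(sizes_up), mean(sizes_down)
    actual = [average_change(s, u, d)
              for s, u, d in zip(shares, sizes_up, sizes_down)]
    # Share of increases held at its sample mean: only the sizes move.
    fixed_share = [average_change(s_bar, u, d)
                   for u, d in zip(sizes_up, sizes_down)]
    # Sizes held at their sample means: only the composition moves.
    fixed_magnitude = [average_change(s, up_bar, down_bar) for s in shares]
    return actual, fixed_share, fixed_magnitude


# Hypothetical data: the share of increases falls while absolute sizes stay put.
shares = [0.9, 0.6, 0.5]
sizes_up = [0.09, 0.09, 0.09]
sizes_down = [0.10, 0.10, 0.10]

actual, fixed_share, fixed_magnitude = counterfactuals(shares, sizes_up, sizes_down)
print(actual, fixed_share, fixed_magnitude)
```

In this stylized case the absolute sizes never move, so the fixed-magnitude series coincides with the actual one while the fixed-share series is flat, which is the pattern the text attributes to the low-inflation part of the sample.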
V.B. Setting of Consumer Services Prices
Services represent a smaller share of expenditures (25.1%) in my basket than in the entire CPI (41.4%). This difference primarily reflects the exclusion of rents, which account for one-third of Mexican spending on services and for which individual data are not available. It also stems from my decision to exclude all items that are not available for purchase every month of the year (education services), or whose price is regulated (such as taxi and public transportation), in order to focus on market prices that are free to respond to changes in the economic environment. Overall, my basket is representative of 32.8% of consumer expenditures on
PRICE SETTING DURING LOW AND HIGH INFLATION
1241
(a) Frequency of price changes and inflation 70 Frequency Inflation
60
%
50 40 30 20 10 0 1994
1995
1996
1997
1998
1999
2000
2001
2002
(b) Frequency of price increases and decreases 70 Increases Decreases
60
%
50 40 30 20 10 0 1994
1995
1996
1997
1998
1999
2000
2001
2002
FIGURE VI Monthly Frequency of Price Changes (Nonregulated Services) All statistics in the figure, including inflation, are computed using the sample of nonregulated services.
services, compared to about half for the sample used by Bils and Klenow (2004) for the United States.15 The upper panel of Figure VI displays the frequency of price changes and inflation in the subsample of services over my sample period. As was the case with goods, services inflation peaked in April 1995, reaching 65.7% (5.5% in monthly terms), whereas the corresponding frequency of price changes rose to a sample high of 53.7%. However, there are several notable differences between the setting of goods and services prices. First, price changes are much less frequent among services than among goods at all levels 15. I estimated this proportion based on the 1993–1995 CPI weights for all urban consumers. For consistency with the COICOP methodology used throughout this paper, I excluded energy categories and classified food consumed away from home under “services” (items such as restaurant meals are categorized as “goods” under the BLS methodology, but as “services” under the COICOP).
of inflation. Even in 1995, as services inflation averaged 29.7%, the frequency of services price changes was lower (21.6%) than that of goods over the last year of the sample (33.6%), when goods inflation averaged only 1.5%. Second, services price changes are much less uniformly distributed over the year than are those for goods; nominal adjustments tend to cluster in the first quarter of each year. Another strong seasonal pattern would also be apparent in August and September if education services were added to the sample. Third, the frequency of price changes is the key margin driving the adjustment in services inflation, as hinted by the strong correlation between the two series. A role for movements in the average magnitude of price changes (whose time series is not shown) can be seen by noting the divergence between inflation and frequency series around the peak of inflation and at the beginning of some years, when price changes tend to be relatively large. Fourth, as shown in the lower panel of Figure VI, service price decreases are much less frequent than price increases, especially when inflation is high. In 1995, as inflation was rampant, a meager 1.5% of services price changes were negative, compared to 14.8% for goods. Over the last year of the sample, about 15.3% of services price changes were negative, compared to 46.1% for goods. Finally, services prices exhibited substantially more inertia than goods prices over the sample period. In the year prior to the Mexican peso crisis, the average rates of goods and services inflation were similar, at 7.6% and 6.6%, respectively. In 1995, the goods price index rose 15.4 percentage points more than that of services. By the turn of 1997, the ratio of services to goods prices had fallen 22.2 percentage points relative to its average in 1994, and would not return to its precrisis level before early 2002. 
Even in the last year of the sample, services inflation was running substantially higher than goods inflation, at 7.6% and 1.5% on average, respectively, suggesting that the inflationary consequences of the Mexican peso crisis had yet to be fully passed through to services prices.

V.C. Inflation Variance Decomposition

To gauge the relative importance of movements in the frequency and magnitude of price changes for the variance of inflation, Klenow and Kryvtsov (2008) proposed the following decomposition:

$$\operatorname{var}(\pi_t) = \underbrace{\overline{fr}^{2} \cdot \operatorname{var}(dp_t)}_{\text{Intensive margin}} + \underbrace{\overline{dp}^{2} \cdot \operatorname{var}(fr_t) + 2\,\overline{fr} \cdot \overline{dp} \cdot \operatorname{cov}(dp_t, fr_t) + O_t^{2}}_{\text{Extensive margin}},$$

where $O_t^{2}$ are high-order terms that are functions of $fr_t$. If price changes are perfectly staggered, as in the baseline Calvo or Taylor model, then the intensive margin accounts for all of the variance of inflation. Using monthly U.S. CPI data from 1988 to 2004, Klenow and Kryvtsov (2008) find that the intensive margin accounts for about 95% of the inflation variance, whereas the extensive margin terms, collectively or individually, are small. As shown in Table III, the Mexican data also point to a minor role for movements in the frequency of price changes when restricted to the low-inflation period after mid-1999. The intensive margin’s share of inflation variance is 89.2% over that period for the full sample, a proportion that reaches 93.9% among goods. Over the entire sample period, however, the intensive margin’s share is only 41.4% of the inflation variance, and falls further to 34.7% when limited to the period January 1995 to June 1999. This finding clearly indicates that fluctuations in $fr_t$ played an important role in the dynamics of inflation over the full sample period, and especially when inflation was high and volatile.

TABLE III
INFLATION VARIANCE DECOMPOSITIONS

| Sample | Inflation mean (%) | Inflation std. dev. (%) | Inflation autocorr. | Intensive margin share of inflation variance (%) |
|---|---|---|---|---|
| Full sample period (January 1994–June 2002) | | | | |
| Full sample | 14.4 | 14.2 | 0.81 | 41.4 |
| Nonregulated goods | 14.3 | 16.1 | 0.81 | 48.3 |
| Nonregulated services | 14.5 | 10.0 | 0.69 | 10.6 |
| January 1995–June 1999 | | | | |
| Full sample | 21.7 | 15.3 | 0.82 | 34.7 |
| Nonregulated goods | 22.6 | 17.2 | 0.84 | 41.8 |
| Nonregulated services | 19.1 | 11.0 | 0.66 | 9.9 |
| July 1999–June 2002 | | | | |
| Full sample | 5.0 | 4.4 | 0.06 | 89.2 |
| Nonregulated goods | 3.5 | 5.6 | 0.08 | 93.9 |
| Nonregulated services | 9.2 | 4.1 | 0.39 | 18.0 |

Alternatively, inflation can be decomposed as

$$\pi_t = \pi_t^{+} + \pi_t^{-},$$

where $\pi_t^{+} = fr_t^{+} \cdot dp_t^{+}$ ($\pi_t^{-} = fr_t^{-} \cdot dp_t^{-}$) is the inflation contribution of items whose price rose (fell) over the month. Following Klenow and Kryvtsov (2008), the variance of inflation can then be expressed as

$$\operatorname{var}(\pi_t) = \underbrace{\operatorname{var}(\pi_t^{+}) + \operatorname{cov}(\pi_t^{+}, \pi_t^{-})}_{\text{pos}} + \underbrace{\operatorname{var}(\pi_t^{-}) + \operatorname{cov}(\pi_t^{+}, \pi_t^{-})}_{\text{neg}}.$$
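As a rough illustration of the two decompositions above, the following Python sketch computes the intensive and extensive margin terms and the pos and neg terms from monthly series. It is not the author's code: the function names are hypothetical, population (not sample) moments are assumed, and the extensive margin is obtained as a residual so that it absorbs the high-order terms $O_t^{2}$.

```python
from statistics import mean

def cov(x, y):
    """Population covariance of two equal-length series."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def intensive_extensive(fr, dp):
    """Split var(pi_t), with pi_t = fr_t * dp_t, into the two margins."""
    pi = [f * d for f, d in zip(fr, dp)]
    var_pi = cov(pi, pi)
    # Intensive margin: squared mean frequency times var(dp_t).
    intensive = mean(fr) ** 2 * cov(dp, dp)
    # Extensive margin: frequency variance, covariance term, and the
    # high-order remainder, computed here as a residual.
    extensive = var_pi - intensive
    return var_pi, intensive, extensive

def pos_neg(pi_pos, pi_neg):
    """Split var(pi_t), with pi_t = pi_t^+ + pi_t^-, into pos and neg terms."""
    c = cov(pi_pos, pi_neg)
    pos = cov(pi_pos, pi_pos) + c
    neg = cov(pi_neg, pi_neg) + c
    return pos, neg

# Made-up illustrative numbers (not taken from the paper's data).
fr = [0.25, 0.30, 0.45, 0.35]
dp = [0.01, 0.02, 0.06, 0.03]
var_pi, intensive, extensive = intensive_extensive(fr, dp)
intensive_share = intensive / var_pi
```

Dividing `pos` by the variance of $\pi_t^{+} + \pi_t^{-}$ gives a ratio of the same form as pos/var($\pi_t$) discussed in the text.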
Over the full sample period, I find that pos/var($\pi_t$) = 0.82, a clear indication that most of the variance of inflation can be traced back to movements in the inflation contribution of price increases. When restricted to the last three years of the sample, a period of relatively low inflation, I find that pos/var($\pi_t$) = 0.32, a value noticeably lower than that reported by Klenow and Kryvtsov (2008) for the United States (0.65). The difference seems attributable to the exceptionally large downward movements in the price of fresh produce at the beginning of 2001 and 2002.16

VI. INTERNATIONAL COMPARISONS

My findings for the low-inflation portion of my sample are broadly consistent with the results reported in U.S. and euro-area studies.17 Evidence on the setting of consumer prices under high inflation is more limited, however. Table IV lists the main empirical studies in high-inflation environments and shows, for each one, the composition of the basket, the average inflation rate, and the mean frequency of price changes. In comparison to my Mexican data set, the samples from these studies are relatively small and predominantly composed of food items. My sample represents a significant broadening of the sample of Mexican CPI food prices used by Ahlin and Shintani (2007) in their analysis of price dispersion. Moreover, the sample periods from previous studies are typically restricted to a few consecutive years, which limits the performance of time series analysis.

TABLE IV
COMPARISON OF HIGH-INFLATION STUDIES

| Country | Authors | Sample product coverage | Observations per month (a) | Sample period (b) | Inflation (%, a.r.) (c) | Mean monthly frequency (%) |
|---|---|---|---|---|---|---|
| Argentina | Burstein, Eichenbaum, and Rebelo (2005) | 58 goods sold in 8 supermarkets in Buenos Aires and 10 services | 563 | Mar. to Dec. 2002 | 39.7 | 54.5 |
| Israel | Lach and Tsiddon (1992) | 26 food products (mostly meat and alcoholic beverages) | 258 | 1978–1979 | 77.0 | 46.5 |
| Israel | Lach and Tsiddon (1992) | 26 food products (mostly meat and alcoholic beverages) | 278 | 1981–1982 | 116.0 | 60.4 |
| Israel | Eden (2001) | 23 food products (mostly meat and alcoholic beverages) | 254 | 1991–1992 | 13.6 | 34.6 |
| Israel | Eden (2001), Baharad and Eden (2004) | Up to 390 narrowly defined products from the Israeli CPI | 2,802 | 1991–1992 | 13.6 | 24 |
| Mexico | Ahlin and Shintani (2007) | 44 food products sold in Mexico City | 573 | 1994–1995 | – | – |
| | | | | (1994) | 7.1 | 49.3 |
| | | | | (1995) | 52.0 | 66.0 |
| Mexico | Gagnon (this study) | 227 product categories, representing 54.1 percent of Mexican consumption expenditures | 31,470 | 1994–2002 | – | – |
| | | | | (1995) | 52.0 | 39.2 |
| | | | | (1996) | 27.7 | 32.2 |
| | | | | (1997) | 15.7 | 28.3 |
| | | | | (1999) | 12.3 | 27.5 |
| | | | | (2001) | 4.4 | 27.3 |
| Poland | Konieczny and Skrzypacz (2005) | 52 goods, including 37 grocery items, and 3 services | Up to 2,400 | 1990–1996 | – | – |
| | | | | (1990) | 249.3 | 59 |
| | | | | (1992) | 44.3 | 39 |
| | | | | (1994) | 29.5 | 32 |
| | | | | (1996) | 18.5 | 30 |

a Author’s calculations for Poland based on Konieczny and Skrzypacz (2005). The figures for the Lach and Tsiddon samples are taken from Eden (2001).
b Selected subsample periods are shown in parentheses.
c Author’s calculations based on the change in the official CPI for Argentina, Israel, and Mexico. Unlike in the remainder of the paper, the figures are not in logarithmic changes.

16. To verify this hypothesis, I computed pos/var($\pi_t$) based on a sample of Mexican CPI prices starting in July 2002 and ending in March 2007. The average inflation rate over that period (3.9%) is similar to the July 1999 to June 2002 period (5.0%), but fresh produce prices display few exceptionally large movements. The corresponding value of pos/var($\pi_t$) for this sample is 0.60, a figure similar to that of Klenow and Kryvtsov (2008).
17. See Dhyne et al. (2006) for a review of the main U.S. and euro-area findings.
18. To facilitate comparisons, all inflation figures in this section are computed in the standard way rather than using logarithmic differences.
19. The figures for the samples considered by Lach and Tsiddon (1992) are taken from Eden (2001).

The first study of individual consumer price setting in a high-inflation context was done by Lach and Tsiddon (1992). They considered a sample of 26 food products from the Israeli CPI (mainly meat and alcohol) during two time periods: 1978–1979 and 1981–1982. For the former period, they found that 46.5% of prices changed every month, whereas inflation averaged 77%.18 The frequency of price changes rose to 60.4% in 1981–1982 as inflation reached an impressive 116%.19 Their results clearly indicate that
the frequency of food price changes can be responsive to the rate of inflation. Konieczny and Skrzypacz (2005) study the transition from a planned to a market economy in Poland. At the peak of inflation in 1990, the price of 59% of items in their basket, composed mainly of food products, changed every month, a proportion that halved as inflation fell to just under 20% in 1996. Like mine, their findings are consistent with a nonlinear relationship between frequency and inflation; the frequency fell about 0.4 percentage point for each percentage point decline in inflation from 1991 to 1993, but then only 0.25 percentage point from 1993 to 1996. One must be very careful when making cross-country comparisons, even when inflation rates are similar, because of large variations in basket composition and methodology. For example, the frequency of price changes reported by Ahlin and Shintani (2007) for their sample of food prices in Mexico in 1995 exceeds by 27 percentage points the one I find in my broader sample of goods and services. Similarly, Burstein, Eichenbaum, and Rebelo (2005) report a frequency of price changes that is similar to that reported by Konieczny and Skrzypacz (2005) for Poland in 1990 (54.5% versus 59%, respectively), even though the rate of inflation in Poland was over six times that of Argentina.20 The studies of Eden (2001) and Baharad and Eden (2004) are possibly the closest in spirit to mine, as they make comparisons between low- and high-inflation periods while controlling for basket composition. Using data from the Israeli CPI for 1991 and 1992, a period of relatively low inflation, they construct a basket matching up to 23 of the 26 food products in the Lach and Tsiddon (1992) study, to which they compare their findings. Even though the product coverage of their sample is much smaller than mine for Mexico, the findings are very similar. 
In particular, the frequency of price changes at the peak of inflation (60.4%) is nearly double that for the relatively low inflation sample (34.6%), indicating a major role for the frequency of price changes in the adjustment to a higher inflation rate.

20. This relatively higher frequency of price changes in Argentina may be related to the type of establishments surveyed (supermarkets). Baudry et al. (2007) report that the outlet size is positively correlated with the frequency of price changes in French CPI data.

VII. EMPIRICAL PERFORMANCE OF MENU-COST MODELS

In this section, I investigate whether a menu-cost model with idiosyncratic technology shocks can correctly predict the average magnitude and frequency of price changes at levels of inflation similar to the ones observed in Mexico over my sample period. This particular model was chosen because it has several desirable features. First, the menu costs produce infrequent, lumpy nominal price adjustments. Second, the model leaves the frequency of price changes free to vary with inflation, a feature not found in most time-dependent models or in state-dependent models in which nominal prices adjust every period. Third, the presence of idiosyncratic technology shocks ensures that individual price changes will be observed even when aggregate inflation is near or at zero. Moreover, these shocks give rise to both positive and negative price adjustments, which might help the model to generate empirically plausible offsetting movements in the frequency of price increases and decreases.

VII.A. Economic Environment

The economic environment is very similar to the models of Danziger (1999) and Golosov and Lucas (2007). The economy consists of three types of agents. An infinitely lived representative household supplies labor and consumes a basket of differentiated consumption items. These items are produced by a continuum of monopolistically competitive firms subject to idiosyncratic technology shocks. Finally, there is a monetary authority that exogenously sets the rate of money growth, $g_t$, at each period $t$. For simplicity, I assume that $g_t$ follows a Markov switching process.

Households. The problem of the representative household is to choose a sequence for consumption, $\{C_t\}$, and for hours worked, $\{N_t\}$, in order to maximize its present discounted utility,

$$\max_{\{C_t, N_t\}} E_0 \left[ \sum_{t=0}^{\infty} \beta^{t} \left( \log C_t - \psi N_t \right) \right],$$

subject to a budget constraint, $P_t C_t = W_t N_t + P_t \Pi_t$, and a simple money demand, $P_t C_t = M_t$. The variable $P_t$ is the price index, $W_t$ is the wage rate, and $M_t$ is the household’s holding of money. Real profits, $\Pi_t$, are expressed in units of the consumption basket and remitted every period by intermediate firms. The budget constraint states that consumption spending equals the sum of a household’s labor income and profits received from firms. Following Golosov and Lucas (2007), I assume that utility is separable, logarithmic in consumption, and linear in labor. Under these assumptions, the wage rate is proportional to the stock of money, $\psi = W_t/(P_t C_t) = W_t/M_t$. Consumption,

$$C_t = \left( \int (c_{j,t})^{(\theta-1)/\theta} \, dj \right)^{\theta/(\theta-1)},$$

is a composite of individual consumption items aggregated using a Dixit–Stiglitz specification. The resulting demand for individual items must satisfy $c_{j,t} = (p_{j,t}/P_t)^{-\theta} C_t$. The effective price index in this economy is $P_t = \left( \int (p_{j,t})^{1-\theta} \, dj \right)^{1/(1-\theta)}$. To ensure comparability with my empirical results, all statistics of the model, including inflation, are computed using the methodology outlined in Section IV.

Intermediate Firms. There is a continuum of measure one of monopolistically competitive firms. At the beginning of each period, each firm independently draws an idiosyncratic productivity shock and observes the monetary injection and aggregate objects. It then decides whether to keep selling its item at the same nominal price as in the previous period, or to incur a menu cost (expressed in units of labor), $\xi$, in order to reoptimize its price. The firm must satisfy the demand once a price is set for the period. The production function of the $j$th firm is linear in labor, $y_{j,t} = \phi_{j,t} n_{j,t}$. I assume that labor productivity, $\phi_{j,t}$, evolves according to

$$\log \phi_{j,t} = (1 - \rho) \log \bar{\phi} + \rho \log \phi_{j,t-1} + \varepsilon_{j,t},$$

where technological innovations, $\varepsilon_{j,t}$, are drawn from a normal distribution $N(0, \sigma_\varepsilon^{2})$. Firms maximize the present discounted value of their real profits.

It is convenient to express the problem of the firm recursively in order to solve it using dynamic programming techniques. To ensure stationarity, all nominal variables are scaled by the money stock. Let $V(\phi, p, \mu, g)$ be the value function of an optimally behaving firm just before it decides whether to change or retain its nominal price from the previous period. The state of the firm comprises its price, $p$, its technological level, $\phi$, the joint distribution of individual prices and technologies in the economy, $\mu = \Phi \times P$, and the current rate of money growth, $g$.
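The Dixit–Stiglitz demand system described above can be checked numerically on a finite set of equally weighted items, which stands in for the continuum of firms. This is a minimal sketch under stated assumptions: the function names, the value of $\theta$, and the prices are hypothetical and are not taken from the paper's calibration.

```python
def price_index(prices, theta):
    """P = (average of p_j^(1 - theta))^(1 / (1 - theta))."""
    n = len(prices)
    return (sum(p ** (1 - theta) for p in prices) / n) ** (1 / (1 - theta))

def item_demands(prices, theta, C):
    """c_j = (p_j / P)^(-theta) * C for each item j."""
    P = price_index(prices, theta)
    return [(p / P) ** (-theta) * C for p in prices]

def aggregate(consumptions, theta):
    """Dixit-Stiglitz aggregator applied to a finite set of items."""
    n = len(consumptions)
    e = (theta - 1) / theta
    return (sum(c ** e for c in consumptions) / n) ** (1 / e)

# Hypothetical values chosen for illustration only.
theta, C = 7.0, 1.0
prices = [0.9, 1.0, 1.1, 1.3]

P = price_index(prices, theta)
c = item_demands(prices, theta, C)
# Average spending across items should equal P * C.
spending = sum(p * cj for p, cj in zip(prices, c)) / len(prices)
```

With these definitions, `aggregate(c, theta)` recovers `C` and `spending` equals `P * C` up to floating-point error, which is the sense in which $P_t$ is the effective price index for the composite good.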
The value function is given by

$$V(\phi, p, \mu, g) = \max \left\{ V_{nc}(\phi, p, \mu, g),\; V_{c}(\phi, \mu, g) \right\},$$

where $V_{nc}(\phi, p, \mu, g)$ is the value function associated with the firm’s decision to make no change to its nominal price in the current period and behave optimally thereafter, and $V_{c}(\phi, \mu, g)$ is the corresponding value function of a firm changing its price in the current period. These functions are expressed as

$$V_{nc}(\phi, p, \mu, g) = \pi(\phi, p, \mu, g) + \beta \int q(\mu, g, \mu'(\mu, g'), g') \times V(\phi', p - g', \mu'(\mu, g'), g') \, d\lambda(\phi', g' \mid \phi, g) \tag{1}$$

and

$$V_{c}(\phi, \mu, g) = \max_{\tilde{p}} \; V_{nc}(\phi, \tilde{p}, \mu, g) - \xi \frac{W(\mu, g)}{P(\mu, g)}, \tag{2}$$

respectively. The period real gross profit function is expressed as

$$\pi(\phi, p, \mu, g) = \left( \frac{p}{P(\mu, g)} \right)^{-\theta} \left( \frac{p}{P(\mu, g)} - \frac{1}{\phi} \frac{W(\mu, g)}{P(\mu, g)} \right) C(\mu, g).$$

The integral on the right-hand side of $V_{nc}(\phi, p, \mu, g)$ gives the expected value function in the next period, weighted by the ratio of marginal utility of consumption across periods, $q(\mu, g, \mu', g') = C(\mu, g)/C(\mu', g')$. The transition probabilities across exogenous states are represented by the measure $d\lambda(\phi', g' \mid \phi, g)$. The law of motion for the distribution of firms, $\mu'(\mu, g')$, is an object to be determined in equilibrium.

VII.B. Definition of Equilibrium

Conditional on the money growth and technology processes, I define a competitive recursive equilibrium as a pair of value functions $\{V_{nc}, V_{c}\}$, a collection of functions for aggregate objects $\{P, C, W, N\}$, a pricing function $\tilde{p}$, and a law of motion $\mu'$ such that (a) $\{V_{nc}, V_{c}\}$ is a fixed point of the system formed by equations (1) and (2) given $P$, $C$, $W$, $\tilde{p}$, and $\mu'$, (b) $\tilde{p}$ is optimal given $P$, $W$, and $V_{nc}$, (c) $C$ and $N$ solve the household’s problem given $P$ and $W$, (d) $\mu'$ correctly represents the law of motion of the distribution of firms given $V_{nc}$, $V_{c}$, and $\tilde{p}$, and (e) the labor, consumption, individual items, and money markets clear.

All equilibrium objects are functions of the joint distribution of individual prices and technologies, which is an infinite-dimensional object. In order to make the computation of an equilibrium amenable to standard solution techniques, I follow Krusell and Smith (1998) and approximate $\mu$ using a selection of its moments. A solution is then computed using value function iterations. Details of the solution method and model calibration are relegated to Appendix II.21

21. The C programs, detailed calibrations, and instructions to replicate the results are available as an Online Appendix on the author’s website.

VII.C. Main Predictions of the Model

I first investigate whether the model is able to correctly predict the level of the frequency and average magnitude of price
changes over a range of inflation similar to that experienced by Mexico. For computational and expositional convenience, I focus initially on a version of the model with nonstochastic money growth, so that all aggregate objects and the joint distribution of individual prices and technologies are constant over time. The findings are robust to considering changes in trend money growth in the stochastic–money growth version of the model. The model is calibrated to match the average frequency and absolute magnitude of price changes over the last three years of the sample, a period when annual inflation averaged 5.0%. The model’s predictions are then recorded for steady-state inflation rates ranging from 0% to 50%, holding all other parameters constant. In addition to calibrating the model using the full sample of items, I also considered separate calibrations for the subsamples of goods and services. Turning first to the results for the full sample, the upper left panel of Figure VII indicates that the model matches remarkably well the average frequency of price changes at various levels of inflation. The diamonds, squares, and triangles represent the average monthly frequencies of price changes, increases, and decreases, respectively, for each calendar year in my sample. The lines indicate the corresponding predictions of the model. As steady-state inflation is increased from a low to a high level, the model produces an initially slow rise in the monthly frequency of price changes, similar in magnitude to the data. The model also fares well at reproducing the underlying opposite movements in the frequency of price increases and decreases. The frequency of price increases rises steadily and almost linearly in the model over the range of steady-state inflation considered. The corresponding decline in the frequency of price decreases is fastest at low levels of inflation. 
Overall, my calibration predicts that the frequency of price increases is more sensitive to a change in steady-state inflation than the frequency of price decreases, consistent with the evidence presented in Section V. The model fits equally well the average magnitude of price changes, as shown in the upper right panel of Figure VII. When inflation is low, the average magnitude of price changes responds almost linearly to a change in steady-state inflation, a counterpart to the weak response of the frequency of price changes. In contrast, a change in inflation has little impact on the average absolute magnitude of price increases and decreases. The panel thus hints that in the model, as is the case in the data, the high correlation
between the average magnitude of price changes and the level of inflation is driven primarily by shifts in the relative occurrence of price increases and decreases, not by changes in the absolute size of nominal adjustments.

FIGURE VII
Average Frequency and Magnitude of Price Changes, Increases, and Decreases Predicted by the Model
Panels: (a) Frequency, all items; (b) Magnitude, all items; (c) Frequency, goods; (d) Magnitude, goods; (e) Frequency, services; (f) Magnitude, services. Vertical axes are in percent; horizontal axes show annual inflation (%) from 0 to 50. Legend: changes, increases, and decreases (model and data). The lines in the panels display the model’s predicted frequency (left-hand panels) and average magnitude (right-hand panels) of price changes, increases, and decreases at various levels of annual inflation. The first, second, and third rows of panels show separate model calibrations using all items in the sample, all goods, and all services, respectively. For each calendar year in each subsample, the diamonds, squares, and triangles show the corresponding sample annual averages for price changes, increases, and decreases, respectively.

The panels on the second and third rows of Figure VII present corresponding sample and model statistics for the goods and services subsamples, respectively. Given that goods account for a large proportion of price changes in the full sample, the model’s fit of the goods subsample is very similar to the model’s fit of the entire sample. Despite important differences between goods and services price-setting behavior, the model matches reasonably well the annual statistics in the subsample of services. The calibration for services entails a smaller variance of technological innovations ($\sigma_\varepsilon^{2} = 0.0011$) and larger menu costs ($\xi = 0.0075$) than the calibration for goods (for which $\sigma_\varepsilon^{2} = 0.0032$ and $\xi = 0.0035$). As a result, services price changes in the model are relatively infrequent and mainly positive whenever inflation exceeds a few percentage points. Moreover, the model is consistent with greater sensitivity of the frequency of services price changes when steady-state inflation moves up from a low level.

I next investigate whether the model is consistent with the dynamic properties of aggregate inflation, in particular with the respective roles of movements in the frequency and magnitude of price changes in accounting for variations in the monthly inflation series. In order to do so, I assume that money growth follows a Markov-switching process, as described in Appendix II, and conduct two experiments. In the first experiment, I calibrate the money growth process so that the model matches the mean, standard deviation, and persistence of inflation in the full sample over the low-inflation period shown in Table III.
I then compute the inflation variance decompositions proposed by Klenow and Kryvtsov (2008) and compare the model’s predictions to the data. The procedure is repeated for the relatively high-inflation period of January 1995 to June 1999. In the second experiment, I hold constant the variance and persistence of the money growth process at its calibrated values in the low-inflation episode. I then compute the inflation variance decompositions for levels of trend inflation ranging from 0% to 50%. Although factors not captured by my simple one-sector monetary model surely had an influence on inflation in my sample, these experiments highlight key differences in the dynamic properties of low and high inflation found both in the model and the data.
When calibrated to the low-inflation period in Table III, the model predicts that the share of variance accounted for by the intensive margin is 94.3%, a proportion very similar to that found in the data (89.2%). All components of the extensive margin are relatively small, which indicates that movements in the frequency of price changes account for little of the variance of inflation during the low-inflation period. The relative unresponsiveness of the frequency in the model when inflation is low is a clear indication that the high share of inflation variance accounted for by the intensive margin in the data cannot be taken as prima facie evidence in favor of models in which the frequency of price changes is exogenously constant. This result echoes the conclusions of Golosov and Lucas (2007), who calibrated their model to match the recent U.S. inflation experience. When calibrated to the January 1995 to June 1999 period, a period of relatively high and volatile inflation, the model’s predicted intensive margin share of inflation variance is 57.4%. This figure is substantially lower than that over the low-inflation period but higher than that in the data (34.7%). The second experiment indicates that the importance of the intensive margin in the model depends heavily on the level of trend inflation. Holding constant the variance and persistence of money growth innovations, the share of inflation variance attributed to the intensive margin is 99.7% when inflation is zero, a proportion that falls to 46.9% and 17.1% once trend inflation reaches 25% and 50%, respectively. The presence of idiosyncratic shocks is a key factor behind the model’s good empirical performance over a wide range of inflation. First, it ensures that nonzero price changes, both positive and negative, are observed when aggregate inflation is near zero. Second, the occurrences of price increases and decreases typically respond in opposite directions in the face of inflationary shocks.
The resulting variation in the composition of nominal adjustments has dampening effects on the frequency of price changes while making the average magnitude of price changes relatively responsive to inflation. Third, the distribution of idiosyncratic shocks helps the model match the absolute size of price increases and decreases over a wide range of inflation. In the steady state of the Sheshinski and Weiss (1977) menu-cost model, which has no idiosyncratic shocks, all price changes are positive whenever inflation is positive. The average size of price changes is directly determined by the width of the (one-sided) Ss band, which is a strictly increasing function of steady-state inflation. As shown by Rotemberg (2008),
a change in steady-state inflation generates an implausibly large change in the width of the Ss band in this model under standard specifications of demand. When idiosyncratic shocks are added to the environment, however, the width of the Ss band is no longer the sole determinant of the average magnitude of price changes: Shifts in the relative occurrence of price increases and decreases also can have an impact. In my calibrations, these shifts play a central role, whereas the width of the Ss band is relatively insensitive to the level of inflation. As a result, the model produces both a strong response of the average magnitude of price changes to inflation and a weak response of the average absolute size. Danziger (1999) provides an interesting example of a dynamic general-equilibrium model with menu costs and idiosyncratic shocks in which the Ss band is independent of the money growth process.

As I have shown, a menu-cost model with idiosyncratic technology shocks is consistent with some key features of individual consumer price setting at both low and relatively high levels of inflation. Despite this success, the model suffers from several known inconsistencies with the data. As discussed by Golosov and Lucas (2007) and several others, the model generates too few small price changes and has an amount of intrinsic persistence that is too small compared to the empirical evidence on the transmission of monetary shocks. Nevertheless, my findings offer hope that versions of the model eventually addressing these shortcomings, such as perhaps versions in which the hypotheses of constant and time-invariant menu costs are relaxed, might continue to provide a good fit of the average magnitude and frequency of price changes if they embed a distribution of positive and negative price changes. Midrigan (2006) offers an interesting exploration along those lines in an environment with multiproduct firms.

VIII. CONCLUSION

In this paper, I provide new evidence on the setting of individual consumer prices under low and high inflation. To do so, I assembled a large data set of store-level prices that is representative of over half of Mexican consumer expenditures. The number of observations in my sample, over thirty thousand per month, is an order of magnitude larger than in other high-inflation studies currently available. Moreover, the data set covers periods of both low and high inflation as well as the transition between the two.
The sample starts in January 1994, one year before the Mexican peso crisis and the sharp increase in inflation that accompanied it, and ends in June 2002, a few years after inflation had been successfully stabilized at a low level. Throughout the discussion, I focus on a decomposition of inflation into the frequency and the average magnitude of nonzero price changes. I find some key differences between the low- and high-inflation periods in my sample. When inflation is low (below 10%–15%), most of the adjustment to inflation occurs through changes in the average magnitude of price changes. The latter is connected to inflation by a tight and near-linear relationship. The frequency of price changes, on the other hand, is only weakly correlated with inflation. By contrast, when inflation is high (above 10%–15%), both the frequency and the magnitude of price changes comove strongly and positively with inflation. Breaking down price changes into positive and negative adjustments helps to explain differences in the behavior of inflation over the low- and high-inflation portions of my sample. As inflation rises from a low level, positive nominal adjustments become increasingly common, whereas negative ones become less so. These opposite effects of inflation on the frequency of price increases and decreases dampen variations in the overall frequency of price changes. As inflation rises further to a high level, the frequency of price increases continues to rise steadily with inflation. The rate of decline in the frequency of price decreases moderates, however, as few price decreases are observed in the economy, and the frequency of price changes then comoves strongly with inflation. One important challenge is to design price-setting models that offer empirically plausible predictions at both low and high levels of inflation.
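The decomposition of inflation into frequency and average magnitude can be illustrated with a small numerical sketch. It assumes an unweighted sample of log price changes for a single month, zeros included; the function name, dictionary keys, and sample values are hypothetical, and any weighting scheme is ignored.

```python
def decompose_inflation(log_changes, tol=1e-12):
    """Split average inflation into frequency and average-magnitude terms.

    `log_changes` holds one month of log price changes, zeros included.
    This is an unweighted illustration, not the paper's estimator.
    """
    n = len(log_changes)
    ups = [x for x in log_changes if x > tol]
    downs = [x for x in log_changes if x < -tol]
    n_changes = len(ups) + len(downs)
    return {
        "inflation": sum(log_changes) / n,
        # share of prices that changed, and the mean nonzero change
        "frequency": n_changes / n,
        "magnitude": (sum(ups) + sum(downs)) / n_changes if n_changes else 0.0,
        # same split, separately for increases and decreases
        "fr_up": len(ups) / n,
        "mag_up": sum(ups) / len(ups) if ups else 0.0,
        "fr_down": len(downs) / n,
        "mag_down": -sum(downs) / len(downs) if downs else 0.0,  # reported as a positive size
    }


# Hypothetical month: six unchanged prices, three increases, one decrease.
sample = [0.0] * 6 + [0.10, 0.05, 0.15, -0.06]
d = decompose_inflation(sample)
```

In this sketch, inflation equals frequency times magnitude, and also equals `fr_up * mag_up - fr_down * mag_down`, which is the sense in which opposite movements in the frequency of increases and decreases can offset each other in the overall frequency.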
The baseline Taylor (1980) and Calvo (1983) models, which are widely used in the literature, assume that the frequency of price changes is constant over time. This assumption is clearly problematic for an environment in which inflation is as volatile as in Mexico over my sample period because both the frequency and magnitude of price changes displayed large variations. I show that a menu-cost model with idiosyncratic technology shocks, on the other hand, predicts remarkably well the level of the average frequency and magnitude of price changes over a wide range of inflation. The joint presence of menu costs and idiosyncratic shocks is key to this good fit. Menu costs ensure that nominal adjustments are lumpy and infrequent, two characteristics of price changes shared by most goods and services
in my sample. The addition of idiosyncratic shocks helps generate a distribution of both positive and negative price changes that is free to move with inflation. Consistent with the data, my calibration of the model predicts that a large number of price increases and decreases will be observed when inflation is close to zero. As inflation increases, the model generates a steady rise in the occurrence of price increases and a simultaneous decline in the occurrence of price decreases, which moderates as negative price changes rapidly dissipate. The model is consistent with a modest role for the frequency of price changes when inflation is low and stable, and a relatively important role when inflation reaches high levels, even holding constant the variance of money shocks. Note that the baseline Calvo and Taylor models are not inconsistent with infrequent and lumpy nominal adjustments and the simultaneous presence of positive and negative price changes; one could simply augment these models with idiosyncratic shocks. Unless the assumption of a constant frequency of price changes is relaxed, however, these models will inexorably fail to capture an important margin by which individual prices adjust from a low- to a high-inflation environment.
APPENDIX I: PRICE AVERAGING

In Mexico, price collectors visit outlets four times every month to collect prices of food items, and they visit twice per month to collect prices for all other items. The prices published in the Diario are an average of the prices collected over the month. In this Appendix, I first discuss how observing a price’s average rather than its actual value at a particular point in time impacts inference about the timing and magnitude of price changes. I then describe how I filtered the data to make the results in this paper more directly comparable to those from studies using prices collected once per month.

A. Effects of Averaging on Frequency and Magnitude

Suppose a price collector observes the price of an item twice every month and then computes two monthly time series. The first series is a simple average of the two prices collected over the month (the average-price series). The other series contains the actual price observed at the second visit (the point-in-time series). The average-price series corresponds to Banco de México’s
current method, whereas point-in-time series are used in the United States and euro area. Changes to the monthly average-price series typically are more frequent and smaller on average than changes to the monthly point-in-time series. To illustrate this point, consider an item whose price is constant over the months t − 1 to t + 1, with the exception of a single adjustment at t. If the price change occurs before the price collector’s first visit at month t, then both prices collected over that month equal the new price. In that case, the average-price and the point-in-time series are identical and correctly reflect the timing and magnitude of the actual price adjustment. Similarly, if the price change occurs after the price collector’s second visit during month t, then both the average-price and point-in-time series display a single price change of the correct magnitude detected at month t + 1. Only when the price change takes place between the two price collections do the average-price and point-in-time series differ. When this is the case, the point-in-time series still accurately matches the timing and size of the actual change in the item’s price. The average-price series, on the other hand, displays two price changes: one at month t and one at t + 1. The second price change is recorded because the average price at t has increased by only half the change in the actual price. If several price changes occur within a month, then both the average-price and the point-in-time series provide incomplete descriptions of individual price changes. The change in the end-of-period series corresponds to the cumulative change in the price over the period, whereas the change in the average-price series reflects the change in the average price over the previous period. Under both approaches, the change in the monthly series may be smaller than, equal to, or larger than the price changes that would be recorded were prices collected continuously.

B. Filtration of Average Price Trajectories

In the above example, a price change occurring between two price collections created two consecutive price changes of equal magnitude in the average-price series. My filtering strategy entails finding such patterns and then constructing a series for the last price collected during the month that minimizes the number of price changes and is consistent with the observed average prices. More formally, let \(p_{ti}\) be the price of a nonfood item recorded during the price collector’s \(i\)th visit at month \(t\). The
published monthly average is \(\bar{p}_t = (p_{t1} + p_{t2})/2\). Consider the case of two consecutive changes in the published average-price series starting at month \(t\). If both changes have the exact same magnitude, that is, if

\[
\bar{p}_t - \bar{p}_{t-1} = \frac{\bar{p}_{t+1} - \bar{p}_{t-1}}{2}, \tag{3}
\]

then I construct a sequence of semimonthly observations, \(\{(p_{\tau 1}, p_{\tau 2})\}_{\tau=t}^{t+1}\), that is consistent with the observed average-price sequence \(\{\bar{p}_{\tau}\}_{\tau=t}^{t+1}\) and features no price change at \(t+1\). I simply assume that a single nominal adjustment was detected at the collector’s second visit, so that \(p_{t1} = \bar{p}_{t-1}\) and \(p_{t2} = \bar{p}_{t+1}\). Whenever the filter finds such a pattern, it replaces \(\bar{p}_t\) by \(\bar{p}_{t+1}\), thus eliminating a potentially spurious price change. A similar approach is used when the published average-price series features up to four consecutive price changes. Considering longer sequences, which are very few, did not improve the fit.

The filtering of individual food prices, which are collected four times per month, follows the same approach. Suppose the price of a food item is constant during month \(t-1\), changes once during month \(t\), and then stays constant during month \(t+1\). If the actual price change is recorded at the collector’s second, third, or fourth visit, then the published average price is

\[
\bar{p}_t = \frac{(5 - d_t)\,\bar{p}_{t-1} + (d_t - 1)\,\bar{p}_{t+1}}{4} \tag{4}
\]
for some visit time \(d_t \in \{2, 3, 4\}\). The change in the published price at \(t\) is exactly 1/4, 1/2, or 3/4 the total change from \(t-1\) to \(t+1\). For such cases, the filter replaces \(\bar{p}_t\) by \(\bar{p}_{t+1}\) in the same way as for nonfood items. When more than two consecutive price changes are observed, there may be more than one combination of detection times \(\{d_{t+\tau}\}_{\tau=t}^{t+N-1}\) and end-of-period prices that is consistent with the published average-price series and features no price change in the last period. Filtering these longer sequences of food prices did not appear to improve accuracy.22

22. In practice, the filter verifies that both sides of equations (3) and (4) are within 0.005 of each other to allow for possible price rounding.

C. Example of Individual Price Trajectory and Filtering

[FIGURE A.1. Illustration of a Price Trajectory Correction. Panel (a): unfiltered series; panel (b): filtered series. Vertical axes in $, from 0 to 60; plotted series: price and price change. The upper and lower panels display, respectively, the unfiltered and filtered price trajectories of a single copy of the book The Universal History of Literature sold in a Mexico City outlet.]

Figure A.1 illustrates how the filtering procedure is implemented. The upper panel displays two years of monthly average prices, along with the corresponding price changes, for a copy of the book The Universal History of Literature, sold in a Mexico City outlet. From January 1994 to December 1995, there were six changes to this unfiltered series. The first happened in August 1994 when the average price increased from $23 to $25. Because the average price remained at $25 in September, the filter leaves the series unchanged. The next two changes occurred in January and February 1995. The published price for January, $28.50, is the exact average of the published prices for December and February ($25 and $32, respectively), a fact consistent with the occurrence of a single change in the actual price from $25 to $32 during the second half of January. The last three price changes occurred in May, June, and July of 1995; the published price increased from $32 to $36.50, then to $47, and finally to $53. This sequence is consistent with a change in the actual price from $32 to $41 after the first price collection in May and then from $41 to $53 after the first price collection in June. The filtered series corresponding to the last observation of each month is displayed at the bottom of
Figure A.1. It contains only four nonzero price changes, and their magnitude is typically equal to or greater than that of those in the unfiltered series.

D. Discussion of the Filter

Banco de México provided me with unpublished semimonthly observations for the months of October and November 2006. These data allow me to compare directly filtered and unfiltered prices of nonfood items to a monthly series of actual end-of-period prices, which the filters aim to reproduce. They can also be used to assess whether individual prices change more than once per month. In order to perform these checks, I first extended the database to December 2006. I then created unfiltered and filtered series in the same way I do for the main sample.23

For nonfood items, published prices were identical to the end-of-period prices observed by price collectors in the vast majority of cases (92.3%, weighted by product categories). The filter correctly left all but a tiny fraction (0.02%) of these exact observations unchanged. For the remaining cases (7.7%), published prices differed from actual end-of-period prices. Of these observations, 54.6% were left unchanged, 44.6% were assigned values that correctly matched end-of-period prices, and 0.9% were replaced by values that still did not match actual end-of-period prices. In short, filtering brings nonfood average-price data closer to actual end-of-period prices while introducing very few mistakes.

More importantly, the use of the filtered series offers a very good approximation of the frequency of price changes. In 98.4% of nonfood observations, the filtered series either indicated a price change when one was observed in the actual end-of-period series, or indicated no price change when none was observed. The majority of the diverging cases is related to within-month sales for which an end-of-period series, contrary to an average-price series, fails to signal the occurrence of a price change.
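The averaging effect and the nonfood filter described in Appendix I.A through I.C can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used for the paper: the function names are hypothetical, the rule for runs of up to four consecutive changes is one plausible reading (a single mid-month adjustment in each month of the run, unwound recursively), and the 0.005 tolerance from footnote 22 is applied to the implied end-of-period price.

```python
def monthly_average(visits):
    """Published series: mean of the prices collected within each month."""
    return [sum(v) / len(v) for v in visits]


def point_in_time(visits):
    """End-of-period series: price seen at the last visit of each month."""
    return [v[-1] for v in visits]


def count_changes(series, tol=0.005):
    """Number of month-to-month changes larger than the rounding tolerance."""
    return sum(1 for a, b in zip(series, series[1:]) if abs(b - a) > tol)


def filter_nonfood(avg, max_changes=4, tol=0.005):
    """Replace averaged prices by implied end-of-period prices whenever a run
    of up to `max_changes` consecutive changes is consistent with one
    mid-month adjustment per month and no change in the run's last month."""
    out = list(avg)
    n = len(avg)
    t = 1
    while t < n:
        if abs(avg[t] - avg[t - 1]) <= tol:
            t += 1
            continue
        end_price = out[t - 1]           # last known end-of-period price
        implied = []
        resolved = False
        for k in range(1, max_changes):  # k averaged months to unwind
            s = t + k - 1
            if s + 1 >= n:
                break
            # two visits: avg = (old + new) / 2, hence new = 2 * avg - old
            end_price = 2 * avg[s] - end_price
            implied.append(end_price)
            if abs(avg[s + 1] - end_price) <= tol:
                out[t:t + k] = implied
                t = s + 2
                resolved = True
                break
        if not resolved:
            t += 1                       # leave the published price as is
    return out


# A single mid-month change seen through two visits per month.
visits = [(10.0, 10.0), (10.0, 12.0), (12.0, 12.0)]
avg_demo = monthly_average(visits)
pit_demo = point_in_time(visits)

# Published monthly averages for the book example, January 1994 to December 1995.
book = [23.0] * 7 + [25.0] * 5 + [28.5] + [32.0] * 3 + [36.5, 47.0] + [53.0] * 6
book_filtered = filter_nonfood(book)
```

On the book series, the sketch leaves the August 1994 change untouched, replaces the January 1995 average by $32, and replaces the May and June 1995 averages by $41 and $53, so the number of price changes falls from six to four, as in the example of Appendix I.C.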
The filtered series also offers a better description of nonzero price changes than the unfiltered series. The change in the unfiltered series matched the change in the end-of-period series of actual prices in about a third of all cases, a proportion that rises to 62.0% after filtering. About half of the remaining cases displayed two price changes within the month, in which case inference about the magnitude of individual price changes is problematic under any monthly series. This discussion illustrates the need to use data collected frequently, such as scanner data, in order to measure precisely the magnitude of individual price changes for items with a high frequency of price changes.

23. Semimonthly data are not available for the sample period considered in the paper. Individual items cannot be linked before and after June 2002 due to a change in the nomenclature of item keys.

In the case of food prices, the unpublished semimonthly data are averages of two weekly prices. A direct comparison with actual price observations is therefore impossible. Inference about the magnitude of individual price changes is likely to be less accurate than for nonfood observations because a much larger proportion of food items witness several price changes within the month. For example, Campbell and Eden (2007) report a probability of nearly 50% that a price will change in the week following a price change for a sample of food products sold in U.S. grocery stores.

Filtering the data has almost no effect on the main patterns reported in the paper. The number of price changes filtered out in any given period is roughly proportional to the number of price changes observed in the unfiltered series. Consequently, movements in the unfiltered and filtered average frequency of price changes are highly correlated. Similarly, filtering raises the average (absolute) magnitude of price increases and decreases rather uniformly.

APPENDIX II: SOLUTION METHOD AND CALIBRATION OF THE MODEL

A. Solution Method

All equilibrium objects are approximated using Chebyshev polynomials. The equilibrium solution is obtained in three steps. First, I guess the aggregate objects functions, \(\{P^{(0)}, C^{(0)}, W^{(0)}, N^{(0)}, \mu^{(0)}\}\), and compute the associated \(\{V_{nc}^{(0)}, V_{c}^{(0)}, \tilde{p}^{(0)}\}\) by value function iteration. Second, I generate a long Markov chain conditional on \(\{V_{nc}^{(0)}, V_{c}^{(0)}, \tilde{p}^{(0)}\}\), randomly sampling in every period a money growth shock and a distribution of technological innovations. I then compute new approximations of the aggregate objects functions \(\{P^{(1)}, C^{(1)}, W^{(1)}, N^{(1)}, \mu^{(1)}\}\) based on the Markov chain and compare them to my initial functions. If they differ, I update my guess and repeat the procedure until numerical convergence.

B. Selection of Moment(s)

I use as an approximation of \(\mu\) the average log deviation of individual prices from their optimum,

\[
\mu \approx \int \log\!\left(\frac{p_i}{\tilde{p}_i}\right) di.
\]
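The moment above is a cross-sectional average, and this appendix goes on to forecast it with a linear rule in money growth and its own lag. The sketch below shows one way such a moment and such a rule might be computed. Everything in it is an assumption for illustration: the function names, the timing convention \(\mu_t = b_0 + b_1 g_t + b_2 \mu_{t-1}\), and the synthetic series with arbitrary coefficients stand in for the simulated Markov chain of the model, which is not reproduced here.

```python
import numpy as np


def mu_moment(prices, optimal_prices):
    """Discrete analogue of the integral: mean of log(p_i / p~_i) across items."""
    p = np.asarray(prices, dtype=float)
    p_opt = np.asarray(optimal_prices, dtype=float)
    return float(np.mean(np.log(p / p_opt)))


def fit_linear_law(mu, g):
    """Least-squares fit of mu[t] = b0 + b1 * g[t] + b2 * mu[t-1].

    Returns the coefficients and the correlation between fitted and
    simulated values, the accuracy measure quoted in the text.
    """
    mu = np.asarray(mu, dtype=float)
    g = np.asarray(g, dtype=float)
    X = np.column_stack([np.ones(len(mu) - 1), g[1:], mu[:-1]])
    y = mu[1:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    corr = float(np.corrcoef(X @ coeffs, y)[0, 1])
    return coeffs, corr


# Synthetic stand-in for the simulated chain (arbitrary, hypothetical numbers).
rng = np.random.default_rng(0)
T = 5000
g = rng.choice([0.0, 0.02], size=T)          # two money growth states
mu = np.zeros(T)
for t in range(1, T):
    mu[t] = 0.001 + 0.5 * g[t] + 0.6 * mu[t - 1] + rng.normal(0.0, 1e-4)

coeffs, corr = fit_linear_law(mu, g)
```

In a full solution loop, a fit of this kind would be recomputed after each simulation and compared with the previous guess until the coefficients stop changing.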
I find that further approximating the law of motion of \(\mu\) and the aggregate objects as linear functions of \(g\) and one lag of \(\mu\) provides a high degree of accuracy. For example, the correlations between \(\mu\) and \(C\) in the Markov chain and their predicted values are 0.997 and 0.948, respectively, based on my calibration to the July 1999 to June 2002 period.

C. Money Growth Process

I assumed that money growth can take two values, \(g_l = \bar{g} - \delta\) and \(g_h = \bar{g} + \delta\), with a constant probability \(1 - a\) of switching between states every period. This simple Markov switching process has several appealing properties. First, key moments can be easily derived, in particular \(E[g] = \bar{g}\), \(\mathrm{Var}(g) = \delta^2\), and \(\rho(g', g) = 2a - 1\). Each moment is controlled by a single parameter, thus facilitating the calibration of the model. Second, although money growth can take only two values, the distribution of inflation outcomes is somewhat richer because it also depends on the joint distribution of technology and prices. Finally, this simple specification keeps the problem computationally manageable.

D. Calibration

Some parameters of the model are taken directly from the literature, whereas others are chosen to match particular moments of the distribution of price changes. For the elasticity of substitution across items, I pick a value of 7, the same as Golosov and Lucas. The discount factor is set to \((1.05)^{-1/12}\). The persistence of technology shocks is set to 0.75, a value similar to that implied by Golosov and Lucas’ (2007) quarterly calibration (\(0.55^{1/3} \approx 0.82\) per month) but higher than Midrigan’s (2006) (0.5 per month). The utility parameter \(\psi\) is chosen so that households work exactly 25% of the time absent menu costs.
In the nonstochastic–money growth version of the model, the variance of technological innovations, \(\sigma_{\varepsilon}^2\), the size of menu costs, \(\bar{\xi}\), and money growth, \(\bar{g}\), are chosen to match the average frequency and absolute magnitude of nonzero price changes over the last three years of the sample, as well as average inflation. In the stochastic money growth version, \(\sigma_{\varepsilon}^2\), \(\bar{\xi}\), \(\delta\), and \(a\) are picked so that the model additionally matches the variance and persistence of inflation over the low- and high-inflation periods shown in Table III. All calibrated parameters and C programs are available as an Online Appendix.
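The moments of the two-state money growth process in Appendix II.C, and the monthly conversions used in the calibration, can be checked numerically. The following is a minimal sketch: the function name and the parameter values passed to it are hypothetical, chosen only to exercise the formulas, and are not the calibrated values of the model.

```python
import numpy as np


def two_state_moments(g_bar, delta, a):
    """Mean, variance, and first-order autocorrelation of a symmetric
    two-state Markov chain that stays in its state with probability a."""
    states = np.array([g_bar - delta, g_bar + delta])
    P = np.array([[a, 1.0 - a],
                  [1.0 - a, a]])
    # Stationary distribution: solve pi P = pi together with sum(pi) = 1.
    A = np.vstack([P.T - np.eye(2), np.ones(2)])
    b = np.array([0.0, 0.0, 1.0])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    mean = pi @ states
    var = pi @ (states - mean) ** 2
    # E[g' g] = sum_i pi_i * g_i * E[g' | g_i]
    cross = sum(pi[i] * states[i] * (P[i] @ states) for i in range(2))
    rho = (cross - mean ** 2) / var
    return float(mean), float(var), float(rho)


# Hypothetical parameter values, for illustration only.
mean_g, var_g, rho_g = two_state_moments(g_bar=0.02, delta=0.01, a=0.9)

# Monthly conversions quoted in the calibration.
beta_monthly = 1.05 ** (-1 / 12)      # annual 5% discount rate, monthly factor
rho_tech_monthly = 0.55 ** (1 / 3)    # quarterly persistence 0.55, per month
```

With these inputs the function returns the mean \(\bar{g}\), the variance \(\delta^2\), and the autocorrelation \(2a - 1\), which is why each moment can be targeted with a single parameter.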
BOARD OF GOVERNORS OF THE FEDERAL RESERVE SYSTEM
REFERENCES

Ahlin, Christian, and Mototsugu Shintani, “Menu Costs and Markov Inflation: A Theoretical Revision with New Evidence,” Journal of Monetary Economics, 54 (2007), 753–784.
Baharad, Eyal, and Benjamin Eden, “Price Rigidity and Price Dispersion: Evidence from Micro Data,” Review of Economic Dynamics, 7 (2004), 613–641.
Baudry, Laurent, Hervé Le Bihan, Patrick Sevestre, and Sylvie Tarrieu, “What Do Thirteen Million Price Records Have to Say about Consumer Price Rigidity?” Oxford Bulletin of Economics and Statistics, 69 (2007), 139–183.
Bils, Mark, and Peter J. Klenow, “Some Evidence on the Importance of Sticky Prices,” Journal of Political Economy, 112 (2004), 947–985.
Burstein, Ariel, Martin Eichenbaum, and Sergio Rebelo, “Large Devaluations and the Real Exchange Rate,” Journal of Political Economy, 113 (2005), 742–784.
Calvo, Guillermo A., “Staggered Prices in a Utility-Maximizing Framework,” Journal of Monetary Economics, 12 (1983), 383–398.
Campbell, Jeffrey R., and Benjamin Eden, “Rigid Prices: Evidence from U.S. Scanner Data,” Federal Reserve Bank of Chicago Working Paper Series No. 200508, 2007.
Danziger, Leif, “A Dynamic Economy with Costly Price Adjustments,” American Economic Review, 89 (1999), 878–901.
Dhyne, Emmanuel, Luis J. Álvarez, Hervé Le Bihan, Giovanni Veronese, Daniel Dias, Johannes Hoffmann, Nicole Jonker, Patrick Lünnemann, Fabio Rumler, and Jouko Vilmunen, “Price Setting in the Euro Area: Some Stylized Facts from Individual Consumer Price Data,” European Central Bank Working Paper Series No. 524, 2005.
Dhyne, Emmanuel, Luis J. Álvarez, Hervé Le Bihan, Giovanni Veronese, Daniel Dias, Johannes Hoffmann, Nicole Jonker, Patrick Lünnemann, Fabio Rumler, and Jouko Vilmunen, “Price Changes in the Euro Area and the United States: Some Facts from Individual Consumer Price Data,” Journal of Economic Perspectives, 20 (2006), 171–192.
Eden, Benjamin, “Inflation and Price Adjustment: An Analysis of Microdata,” Review of Economic Dynamics, 4 (2001), 607–636.
Edwards, Sebastian, “The Mexican Peso Crisis: How Much Did We Know? When Did We Know It?” World Economy, 21 (1998), 1–30.
Golosov, Mikhail, and Robert E. Lucas, “Menu Costs and Phillips Curves,” Journal of Political Economy, 115 (2007), 171–199.
Klenow, Peter J., and Oleksiy Kryvtsov, “State-Dependent or Time-Dependent Pricing: Does It Matter for Recent U.S. Inflation?” Quarterly Journal of Economics, 123 (2008), 863–904.
Konieczny, Jerzy D., and Andrzej Skrzypacz, “Inflation and Price Setting in a Natural Experiment,” Journal of Monetary Economics, 52 (2005), 621–632.
Krusell, Per, and Anthony A. Smith, “Income and Wealth Heterogeneity in the Macroeconomy,” Journal of Political Economy, 106 (1998), 867–896.
Lach, Saul, and Daniel Tsiddon, “The Behavior of Prices and Inflation: An Empirical Analysis of Disaggregated Price Data,” Journal of Political Economy, 100 (1992), 349–389.
Midrigan, Virgiliu, Menu Costs, Multi-Product Firms, and Aggregate Fluctuations, unpublished paper, New York University, 2006.
Nakamura, Emi, and Jón Steinsson, “Five Facts About Prices: A Reevaluation of Menu Cost Models,” Quarterly Journal of Economics, 123 (2008), 1415–1464.
Organisation for Economic Co-operation and Development (OECD), OECD Economic Surveys, 1999–2000: Mexico, 2000 (Paris and Washington, DC: OECD, 2000).
Papke, Leslie E., and Jeffrey M. Wooldridge, “Econometric Methods for Fractional Response Variables with an Application to 401(K) Plan Participation Rates,” Journal of Applied Econometrics, 11 (2006), 619–632.
Rotemberg, Julio J., Fair Pricing, unpublished manuscript, Harvard Business School, 2008.
Sheshinski, Eytan, and Yoram Weiss, “Inflation and Costs of Price Adjustment,” Review of Economic Studies, 44 (1977), 287–303.
Taylor, John B., “Aggregate Dynamics and Staggered Contracts,” Journal of Political Economy, 88 (1980), 1–23.
JOB DISPLACEMENT AND MORTALITY: AN ANALYSIS USING ADMINISTRATIVE DATA∗

DANIEL SULLIVAN AND TILL VON WACHTER

We use administrative data on the quarterly employment and earnings of Pennsylvanian workers in the 1970s and 1980s matched to Social Security Administration death records covering 1980–2006 to estimate the effects of job displacement on mortality. We find that for high-seniority male workers, mortality rates in the year after displacement are 50%–100% higher than would otherwise have been expected. The effect on mortality hazards declines sharply over time, but even twenty years after displacement, we estimate a 10%–15% increase in annual death hazards. If such increases were sustained indefinitely, they would imply a loss in life expectancy of 1.0–1.5 years for a worker displaced at age forty. We show that these results are not due to selective displacement of less healthy workers or to unstable industries or firms offering less healthy work environments. We also show that workers with larger losses in earnings tend to suffer greater increases in mortality. This correlation remains when we examine predicted earnings declines based on losses in industry, firm, or firm-size wage premiums.
I. INTRODUCTION

A growing literature shows that displaced workers (individuals who lose their jobs as part of plant closings, mass layoffs, and other firm-level employment reductions) tend to experience significant long-term earnings losses as well as decreased job stability, lower employment rates, earlier retirement, lower consumption, and decreased health insurance coverage.1

∗ We would like to thank Doug Almond, Marianne Bertrand, David Card, Janet Currie, Eric French, Ed Glaeser, Michael Greenstone, Larry Katz, Wojciech Kopczuk, David Lee, Adriana Lleras-Muney, Claudio Lucifora, Chris Paxson, Jesse Rothstein, Chris Ruhm, Jon Skinner, and two referees for helpful suggestions. Seminar participants at the SOLE 2006 Meetings, the NBER 2006 Summer Institute, the Milan Mills Workshop, London School of Economics, University College London, Pompeu Fabra Barcelona, Boston University, the University of Illinois Urbana–Champaign, Harris School, Harvard University, Columbia University Social Work, the University of California San Diego, the University of California Santa Barbara, the University of California Berkeley, Tufts University, Oxford University, UCL Epidemiology, Princeton University, Prague CERGEI, the Catholic University of Milan, the University of North Carolina Greensboro, the University of Texas Houston, Texas A&M University, RWI Essen, the Federal Reserve Bank of Chicago, and Columbia University provided helpful comments. Alice Henriques and Phil Doctor provided excellent research assistance.

1. See, for example, Ruhm (1991); Olson (1992); Jacobson, LaLonde, and Sullivan (1993); Gruber (1997); Stevens (1997); Chan and Stevens (2001); and Farber (2003). The Bureau of Labor Statistics defines displaced workers to be individuals who lose their main jobs because of the operating decisions of their employers, where in the case of multiple jobs “main job” refers to the job held the longest (see, e.g., Hildreth, von Wachter, and Handwerker [2008]).

© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009

In this
paper, we provide evidence that displaced workers can also experience higher rates of mortality. To study the link between displacement and mortality, we use administrative data on earnings and employment histories for male workers from Pennsylvania in the 1970s and 1980s matched to Social Security Administration (SSA) death records covering the entire United States from 1980 to 2006. Following Jacobson, LaLonde, and Sullivan (1993) (hereafter JLS), we identify instances of displacement as those in which high-tenure workers leave firms experiencing large employment declines.2 We then compare these displaced workers’ subsequent mortality rates with those of similar workers who did not suffer job loss. We find that high-tenure male workers displaced during the early and mid-1980s experienced a significant increase in mortality. Indeed, our estimates suggest a 50%–100% increase in the mortality hazard during the years immediately following job loss. The estimated impact of displacement on annual mortality rates declines substantially over time, but appears to converge to a 10%–15% increase in the hazard rate. If these increases lasted beyond the 25-year window we follow, they would imply a loss in life expectancy of 1.0–1.5 years for workers displaced in middle age. In contrast, we find little effect of job loss on mortality for workers displaced near retirement age. Firm-level employment declines should be exogenous to individual workers’ health developments. Moreover, our results control for the mean and standard deviation of workers’ earnings over a period of several years prior to job loss and are robust to the inclusion of industry or firm effects. They should thus be little affected if, for example, firms selectively lay off less productive workers and less productive workers tend to be less healthy, or if unstable industries or firms provide less healthy work environments. 
In addition, we show that these worker-level results are consistent with a firm-level analysis of the impact of employment declines on mortality that pools displaced workers with those remaining with affected firms. By construction, these “intent-to-treat” estimates are unaffected by the possibility of firms selecting the least healthy workers for layoffs or by misclassification of dying workers as job losers. Thus, our estimates likely identify the causal effect of job loss on mortality.

2. We analyze workers with at least six or at least three years of tenure at the time of job displacement.
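The step from a sustained proportional increase in the death hazard to a loss in life expectancy can be illustrated with a back-of-the-envelope sketch. It assumes a Gompertz hazard for a forty-year-old; the baseline hazard (0.003 per year), the slope (0.085 per year of age), the 70-year horizon, and the function name are all hypothetical choices made for illustration. They are not taken from the paper, whose own calculation may rest on different mortality data.

```python
import math


def remaining_life_expectancy(h0, b, hazard_multiplier=1.0,
                              horizon=70.0, step=0.01):
    """Expected remaining years under the hazard m * h0 * exp(b * t).

    Survival is S(t) = exp(-m * h0 / b * (exp(b * t) - 1)); the expectation
    is the integral of S(t) from 0 to `horizon`, by the trapezoid rule.
    """
    n = int(round(horizon / step))
    total = 0.0
    prev = 1.0  # survival probability at t = 0
    for i in range(1, n + 1):
        t = i * step
        cum_hazard = hazard_multiplier * h0 / b * math.expm1(b * t)
        cur = math.exp(-cum_hazard)
        total += 0.5 * (prev + cur) * step
        prev = cur
    return total


# Assumed Gompertz parameters for a forty-year-old (illustrative only).
H40, SLOPE = 0.003, 0.085
base = remaining_life_expectancy(H40, SLOPE)
loss_10 = base - remaining_life_expectancy(H40, SLOPE, hazard_multiplier=1.10)
loss_15 = base - remaining_life_expectancy(H40, SLOPE, hazard_multiplier=1.15)
```

Under these particular assumed parameters, a permanent 10%–15% increase in the hazard lowers remaining life expectancy by roughly one to one and a half years. This shows only that magnitudes of that order arise naturally from a proportional hazard shift; it should not be read as a replication of the paper's calculation.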
Our estimates of the short- and long-run effects of displacement on the mortality hazard roughly parallel the short- and long-run effects of displacement on earnings and employment reported in JLS and elsewhere. In the short run, displacement is associated with a sharp drop in mean earnings, increased unemployment, and high earnings instability. Our results are consistent with these effects causing acute stress, which may substantially raise the mortality hazard in the short term. In the long run, displacement is associated with a substantial drop in mean earnings and modestly higher employment instability and earnings variability. Several economic models of health determination predict that a decline in lifetime resources should raise mortality.3 Our empirical findings are consistent with a reduction in such resources leading to reduced investments in health or chronic stress, which, in turn, lead to a smaller, but longer term increase in the mortality hazard. Increased earnings instability may also contribute to chronic stress and a long-run increase in mortality.4

To gain insight into the relative importance of some of the channels through which job loss could affect the long-run mortality hazard, we compare our estimates of the “reduced form” effect of displacement on mortality to what one would expect on the basis of displacement’s long-run effect on the mean and variability of workers’ earnings and the correlation of those factors with mortality. In our Pennsylvania data, displacement reduces the mean of long-run earnings by 15%–20%. Given the correlation of mean earnings with mortality, this effect can explain an increase in the death rate hazard equal to 50%–75% of our estimate of the reduced-form effect. Though displacement does not have a significant long-run effect on employment rates, it does raise the variability of earnings somewhat.
Given the significant correlation of earnings variability with mortality in our data, this implies an additional effect of displacement on mortality on the order of 20%–25% of what we estimate for the full reduced-form effect of job loss on mortality.

3. For example, a shift in the lifetime budget constraint would reduce health investments in a neoclassical model of health; alternatively, it could reduce social status and may raise mortality through social stress (see Deaton [2001] for further discussion of these and other approaches). Although some of these factors are likely to operate in the short run as well, too many factors vary simultaneously for the effect of any single channel to be separately assessed.

4. Another potential channel to which our data does not speak is the loss in health insurance. Losses in health insurance may be correlated with earnings reductions (Olson 1992) but may have independent effects as well.

Thus, the impact of displacement
on the mean and variability of earnings may explain an important fraction of the increase in the long-run mortality hazard that we estimate. Our analysis of groups of workers who by their industry or their employer’s characteristics have greater predicted earnings losses confirms that larger earnings reductions at job displacement are associated with greater increases in long-term mortality risk.

Our results are consistent with those of the large literature documenting a strong correlation of socioeconomic status with health.5 However, our paper is one of the first studies to use U.S. data to estimate the long-term effect of a plausibly exogenous labor market event on an objective measure of health for a large group of workers. It thereby establishes a much clearer causal link between labor market and health outcomes than most of the previous literature.6 Our study complements important recent studies based on European administrative data, which find mixed results on the effects of job loss on health.7 Our paper is the only study to closely replicate and extend the approach used in JLS’s well-known analysis of job displacements. In addition to methodological differences, the European studies differ from ours in that they analyze the effect of displacement over shorter horizons.

5. Typical estimates suggest a strong correlation between income and mortality (e.g., Deaton and Paxson [1999]). In addition, a growing literature in economics, sociology, and epidemiology has shown that unemployment and job loss correlate with the incidence of depression, low self-esteem, heart attack, and even suicide (see, e.g., Darity and Goldsmith [1996]; Burgard, Brand, and House [2005, 2007]; Gallo et al. [2006]). However, cross-sectional estimates may not represent causal effects of earnings on mortality because of reverse causality, omitted worker characteristics, and measurement error (e.g., Smith [1999]; Cutler, Deaton, and Lleras-Muney [2006]).

6. Most studies are not based on exogenous sources of variation in individuals’ labor market conditions, objective measures of health and job loss, large sample sizes, a long follow-up period, or detailed pre–job loss career outcomes such as used in our empirical work. This has made it difficult to study the effect of labor market events on health free of measurement error, reverse causality, and omitted variable bias.

7. Rege, Telle, and Votruba (forthcoming) find that workers (men and women) losing their jobs in a plant downsizing during 1993–1998 are more likely to receive disability insurance in 1999 and have a somewhat higher probability of death during 1999–2002. Eliason and Storrie (2007) find that male workers losing their jobs in establishment closures in Sweden during 1987–1988 experience excess mortality for up to four years after job loss. Martikainen, Maki, and Jantti (2007) find no such effects in Finland. Results are similarly mixed for other measures of health. For example, Kuhn, Lalive, and Zweimueller (2007) find that job loss reduces the mental health of men in Austria, whereas Browning, Dano, and Heinesen (2006) find no such effects in Denmark.

In addition, U.S. health care and labor market institutions differ substantially from those in Europe, where workers often have access to universal health insurance and where the
earnings consequences of job loss typically are less severe than in the United States.8

Our results do not conflict with those of Ruhm (2000), who finds that aggregate mortality rates tend to fall during recessions. As we discuss more fully in the conclusion, the situation of an individual displaced worker differs qualitatively from that of the average worker during a recession. Briefly, for the average worker, short-term declines in economic activity may increase time available for healthy activities without significantly reducing lifetime resources. However, the high-tenure displaced workers we study suffer significant long-term earnings reductions without benefiting from an offsetting increase in leisure time.

A potential limitation of our data is that the experiences of male workers displaced from jobs in Pennsylvania during the early and mid-1980s may not be fully representative of those of the typical displaced worker. Indeed, given the severity of the early 1980s recession in Pennsylvania, it is quite possible that our results somewhat overstate the average impact of displacement on mortality. However, the qualitative effects of displacement on other aspects of workers’ lives have been found to be reasonably robust across time and place,9 so our results likely give a good indication of the direction and at least the rough magnitude of the effects that can be expected for the typical displaced male high-tenure worker.

The next section discusses the properties of our data and introduces our econometric framework. Section III contains our main results; Section III.A presents the average effect of displacement on mortality; Section III.B distinguishes between the short- and long-run effects of displacement on mortality and breaks out the effects by current age, age at displacement, and job tenure; Section III.C discusses the implied reductions of life expectancy; and Section III.D summarizes our sensitivity analysis. Section IV discusses our assessment of potential mechanisms through which displacement raises mortality, and Section V concludes.
8. There is considerable heterogeneity in approaches and results among studies analyzing the effect of job loss on earnings in Europe. For example, the effects of job loss on earnings in Austria are small (Card, Chetty, and Weber 2007). Earnings losses in Sweden have been found to be more persistent (Eliason and Storrie 2006). Income losses in Norway fall somewhere in between (Rege, Telle, and Votruba, forthcoming). 9. Earnings losses of duration and magnitude similar to those found by JLS for Pennsylvania have been found in other states in the 1990s, such as California, Connecticut, or Massachusetts (in Schoeni and Dardia [2003], Couch and Placzek [forthcoming], and Kodrzycki [2007]), respectively, and for the entire United States during the early 1980s (von Wachter, Song, and Manchester 2009).
II. EMPIRICAL APPROACH: DATA AND ECONOMETRIC FRAMEWORK

This section details the construction of our data set, which merges quarterly wage records derived from the state of Pennsylvania’s unemployment insurance (UI) system with death records maintained by the SSA. It also explains how we identify displaced workers and contrasts their characteristics with those of workers not affected by displacement. It then describes our basic empirical strategy, which is to compare the mortality experience of workers identified as being displaced with that of otherwise similar workers who are not displaced.

II.A. Data Construction and the Characteristics of Displaced Workers

Our data on workers’ employment and earnings histories are derived from the UI records of the state of Pennsylvania (PA) over the period from 1974 to 1991. For a 5% sample of workers who held jobs covered by UI, we observe quarterly earnings from each PA employer, as well as the employer’s industry.10 Our data on mortality are derived from a database compiled by the SSA and cover deaths occurring anywhere in the United States between 1974 and 2006. The accuracy of the death information has been found to be good for the sample of mature and older male workers we consider.11

We follow JLS in focusing on workers who had very stable employment relationships in the 1970s. Specifically, we analyze data on male workers who had the same principal employer from 1974 to 1979, where the principal employer for a year was the employer from which the worker received the most wage income. We also replicate our results for men with at least three years of job tenure in 1979.

10. JLS (1993) used the same data for the period 1974 to 1986. For a detailed description of the data and their advantages and shortcomings, please see their paper. An “employer” in our data refers to a firm, which may operate multiple establishments, as long as they are in Pennsylvania.
11. The SSA’s Death Master File (DMF) is described and evaluated in Hill and Rosenwaike (2002). Coverage of the death data is better in the 1990s, for older workers, and for men. Recent work comparing the DMF with complete mortality data from the National Center for Health Statistics suggests that coverage for men is between 80% and 90% before age 65 and above 95% after age 65 (see extensive notes by Elizabeth Weber Handwerker, http://socrates.berkeley.edu/~eweber/DMFnotes.htm). We replicated these tabulations for deaths in Pennsylvania for 1980–2002 and found similar results (in our empirical analysis, we also include deaths occurring in other states).

In both cases the restriction isolates stable workers separating from what they had reasons to expect to be
long-term jobs in the absence of mass layoffs. For these workers, displacement was likely to be unexpected and costly.12 A limitation of administrative data is that we do not have a direct measure of whether a particular separation was voluntary or involuntary. As in JLS, we deal with this limitation by defining displaced workers to be those who leave their firms during the period 1980–1986 and for whom their former firms’ employment in the following year was 30% or more below its peak since 1974.13 Other workers leaving their firms during this period are not considered displaced, and in most specifications are left in the comparison group. JLS found that such non-mass-layoff job separators did not, on the average, experience long-term earnings losses.14 Because we use percentage changes in firm employment to identify displaced workers and such changes are not very meaningful for small employers, we further limit our sample to those whose firms employed at least 50 workers in 1979. In addition, we restrict our analysis to male workers. During the period we study, there were relatively few female workers with such stable employment relationships. As a result, sample sizes are too small for meaningful findings for women to be derived. Again following JLS, we restrict some of our analysis to workers born between 1930 and 1959, a group for whom retirement before the 1990s is unlikely. However, for some analyses, which are noted below, we expand the age range to include workers born between 1920 and 1959. A potential concern with our procedure for identifying displacement is that workers who just happen to die in a year in which their firms substantially reduce employment will appear 12. Our sample is not meant to capture all job losers but maintains the focus on workers losing stable jobs that is common in the literature on job displacement (e.g., JLS [1993]; Schoeni and Dardia [2003]; Couch and Placzek [forthcoming]). 
Another reason to impose a minimal tenure restriction when working with administrative data is to exclude voluntary movers. This is discussed in detail in Hildreth, von Wachter, and Handwerker (2008). Hildreth, von Wachter, and Handwerker (2008) and von Wachter, Song, and Manchester (2009) show using administrative data that earnings losses from job displacements are substantial and long-lasting even at shorter tenure durations. 13. Hildreth, von Wachter, and Handwerker (2008) explore the issues that arise in the measurement of displacement using administrative data in detail and conclude that the results based on JLS’s way of identifying mass layoffs are robust to alternative definitions of mass layoffs. 14. These workers were excluded in JLS because, due to their uncertain layoff status, they may belong in the treatment group, in which case including them in the comparison group would underestimate the effects of displacement. On the other hand, if these workers are of worse underlying health, excluding them would bias our results upward. Thus, to err on the conservative side, we included them in our main sample as nondisplaced workers. We also show results based on the original JLS sample restriction.
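The displacement rule described in the text (a separation during 1980–1986 from a firm with at least 50 workers in 1979 whose employment in the following year was 30% or more below its peak since 1974) can be illustrated with a minimal sketch. This is not the authors' code. The function name `is_displaced` and the `firm_emp` mapping from year to firm employment are hypothetical, and two points are assumptions: the peak is measured over 1974 through the separation year, and a missing next-year employment count leaves the separation unclassified.

```python
from typing import Dict


def is_displaced(firm_emp: Dict[int, int], separation_year: int) -> bool:
    """Flag a separation as a displacement under the rule described in the text.

    firm_emp maps calendar year -> the former firm's employment (hypothetical layout).
    """
    # Only separations during 1980-1986 can count as displacements.
    if not 1980 <= separation_year <= 1986:
        return False
    # Sample restriction: the firm employed at least 50 workers in 1979.
    if firm_emp.get(1979, 0) < 50:
        return False
    following = firm_emp.get(separation_year + 1)
    if following is None:
        # Assumption: without next-year employment the separation is not classified.
        return False
    # Assumption: "peak since 1974" is measured through the separation year.
    peak = max(n for year, n in firm_emp.items() if 1974 <= year <= separation_year)
    # Displaced if next-year employment is 30% or more below the peak
    # (integer arithmetic avoids floating-point edge cases at exactly 30%).
    return 100 * following <= 70 * peak


# Illustrative firm: employment falls from a 1974 peak of 1,000 to 650 in 1983.
firm = {1974: 1000, 1979: 900, 1982: 800, 1983: 650}
print(is_displaced(firm, 1982))
```

Separators who fail the test are not treated as displaced and, as in most of the paper's specifications, would remain in the comparison group.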
to be job losers and, thus, displaced workers, even if they would have been able to retain their jobs had they lived. This misclassification of some dying workers as displaced rather than nondisplaced workers would tend to bias simple estimates of the effect of displacement on mortality upward. To address this problem, following a suggestion from a referee, we drop from our samples workers who died during the years their firms suffered mass layoffs. Because we find below that the effects of displacement tend to be largest immediately after job loss, this likely leads us to underestimate the average effect of displacement on mortality.

The first three columns of Table I show means for a number of worker characteristics for the full sample just described, as well as for displaced and nondisplaced workers separately. Both groups of workers were, on the average, in their late thirties, with earnings in the middle of the income distribution for the period. Displaced workers were about half a year younger, had earnings about 6% lower, and experienced slightly more quarters without earnings in the 1974–1979 base period than nondisplaced workers. Displaced workers did, however, have somewhat faster earnings growth during the base period. In addition, during this period, displaced workers were employed by larger firms and were more likely to work in the steel industry or other durable-goods-producing industries. These patterns suggest that it is important to control for potential differences across sectors and firms, as well as for pre–job loss differences in career outcomes among movers and stayers.

Despite having relatively similar earnings during the 1974–1979 base period, displaced workers had much lower average earnings between 1987 and 1991. In part, however, this difference may reflect some displaced workers leaving the state or taking jobs in sectors not covered by UI.
Such workers have zero reported earnings, but may, in fact, have income not covered in the PA UI system. To mitigate such concerns, the last three columns of Table I show results limited to workers who had positive reported earnings in each calendar year from 1980 to 1986, a restriction JLS imposed in their empirical analysis. Differences between displaced and nondisplaced workers in the period 1974–1979 are little affected by this restriction. However, the earnings differential for the period 1987–1991 is narrowed considerably, though it remains quite large. Figure I displays estimates of the percentage difference in annual earnings relative to the base period and to the comparison group of workers remaining at their employer from 1980 to 1986
TABLE I
SAMPLE CHARACTERISTICS BY DISPLACEMENT STATUS

Work restriction in Pennsylvania labor market during 1980–1986: no work restriction in columns (1)–(3); work every year in columns (4)–(6).

| | All workers (1) | Displaced workers (2) | Nondisplaced workers (3) | All workers (4) | Displaced workers (5) | Nondisplaced workers (6) |
| --- | --- | --- | --- | --- | --- | --- |
| Sample size | 21,573 | 7,256 | 14,317 | 17,641 | 4,785 | 12,856 |
| Age in 1979 | 30.42 (7.124) | 30.14 (7.422) | 30.55 (6.964) | 37.42 (7.031) | 37.01 (7.295) | 37.57 (6.925) |
| Log(average quarterly earnings in 1974–1979) | 8.74 (0.358) | 8.70 (0.346) | 8.76 (0.362) | 8.75 (0.345) | 8.70 (0.338) | 8.76 (0.346) |
| Log(std. dev. of log quarterly earnings 1974–1979) | −1.637 (0.732) | −1.483 (0.767) | −1.715 (0.700) | −1.680 (0.709) | −1.545 (0.749) | −1.731 (0.687) |
| Percent change in quarterly earnings 1974–1979 | 0.513 (5.736) | 0.677 (7.699) | 0.430 (4.425) | 0.459 (5.343) | 0.582 (7.287) | 0.413 (4.410) |
| Number of quarters in nonemployment 1974–1979 | 0.48 (0.977) | 0.58 (1.100) | 0.43 (0.904) | 0.45 (0.919) | 0.54 (1.029) | 0.42 (0.873) |
| 1979 firm’s employment | 8,556 (13,944) | 10,483 (16,287) | 7,579 (12,479) | 8,087 (13,267) | 9,065 (15,018) | 7,723 (12,534) |
| Fraction steel industries | 0.179 (0.384) | 0.292 (0.455) | 0.122 (0.328) | 0.163 (0.370) | 0.260 (0.438) | 0.128 (0.334) |
| Fraction other durable goods manufacturing (nonsteel) | 0.297 (0.457) | 0.349 (0.477) | 0.271 (0.444) | 0.300 (0.458) | 0.365 (0.481) | 0.275 (0.447) |
| Fraction other manufacturing | 0.191 (0.393) | 0.164 (0.370) | 0.204 (0.403) | 0.200 (0.400) | 0.183 (0.387) | 0.206 (0.405) |
| Fraction eastern PA | 0.562 (0.496) | 0.475 (0.499) | 0.606 (0.489) | 0.581 (0.493) | 0.521 (0.500) | 0.603 (0.489) |
| Log(average quarterly earnings in 1987–1991) | 8.606 (1.069) | 8.184 (1.310) | 8.791 (0.883) | 8.728 (0.891) | 8.421 (1.064) | 8.838 (0.792) |
| Log(std. dev. of log quarterly earnings in 1987–1991) | −1.344 (0.764) | −1.119 (0.793) | −1.440 (0.730) | −1.393 (0.736) | −1.197 (0.757) | −1.462 (0.716) |
| Number of quarters in nonemployment in 1987–1991 | 4.31 (7.070) | 6.66 (8.207) | 3.11 (6.079) | 2.20 (4.736) | 3.32 (5.900) | 1.79 (4.145) |
| Deaths per 1,000 per year 1987–2006 | 6.764 (0.143) | 7.639 (0.263) | 6.325 (0.170) | 6.343 (0.152) | 6.913 (0.306) | 6.132 (0.175) |
| Deaths per 1,000 per year 1987–1993 | 4.167 (0.181) | 5.151 (0.347) | 3.670 (0.208) | 3.745 (0.189) | 4.400 (0.393) | 3.502 (0.214) |
| Deaths per 1,000 per year 1994–1999 | 7.407 (0.227) | 8.114 (0.411) | 7.053 (0.272) | 6.994 (0.242) | 7.451 (0.481) | 6.826 (0.280) |
| Deaths per 1,000 per year 2000–2006 | 10.815 (0.427) | 11.909 (0.777) | 10.270 (0.510) | 10.347 (0.458) | 11.033 (0.911) | 10.094 (0.529) |

Notes. Standard deviations in parentheses (with exception for death rates, which show standard errors). The samples include only male workers born 1930–1959 in stable employment 1974–1979 at an employer of size fifty in 1979. Displaced workers left jobs in firms whose employment the subsequent year was 30% or more below its post-1974 peak. Information pertaining to employment and earnings is from Pennsylvania. Deaths can occur anywhere in the United States.
FIGURE I Estimate of the Decline in Annual Earnings due to Job Displacement (Sample of Men in Stable Employment 1974–1979, Firm 1979 Employment ≥50, Born 1930–1959, Work in PA Labor Force Every Year 1980–1986) Solid line represents coefficient estimates of the interaction of year effects and displacement dummies in a regression model of log quarterly earnings including year fixed effects, person fixed effects, and a quartic for age. Two standard error bands are drawn around main effects.
controlling for year, age, and worker fixed effects. As in the last three columns of Table I, the model is estimated using a sample that is restricted to workers who had positive earnings every year from 1980 to 1986. In the year immediately after displacement, earnings are over 50 log points below levels expected in the absence of displacement. Losses decline over time, but even eleven years after displacement they are approximately 15%. Clearly, displacement is a major economic setback for the affected workers. We have also analyzed in a similar manner the impact of displacement on several other career outcomes, finding that displacement leads to modest long-run increases in earnings variability and the likelihood of changing jobs or industries. The effect of displacement on other outcomes—such as incidence of nonemployment, industry mobility, or mobility across counties—is significant in the first two to four years after layoffs but not afterward.15 15. See Table 9 of our longer working paper for detailed results (Sullivan and von Wachter 2007).
The last several rows of Table I show mortality rates over a number of time periods (deaths can occur anywhere in the United States). As explained above, to avoid misclassifying dying nonseparators as displaced, we drop workers dying in the year of displacement. Not surprisingly, rates for all workers rise over time, from about four per thousand between 1987 and 1993 to more than ten per thousand between 2000 and 2006. The table also shows that displaced workers experienced higher mortality rates than those who were not displaced. The gap between the groups’ mortality rates was especially high in the period 1987–1993, shortly after displaced workers lost their jobs. Indeed, during this period, displaced workers were more than 40% more likely to die than nondisplaced workers (5.151 per 1,000 versus 3.670 per 1,000). However, even twenty years later, during the period 2000–2006, mortality rates were more than 15% higher for the displaced workers. Of course, these simple comparisons of mortality rates do not control for the systematic differences between displaced and nondisplaced workers that are illustrated in the upper portions of the table.16

II.B. Main Estimation Strategy

To control for differences in other variables that may affect mortality, we employ a standard logistic regression framework. Specifically, we estimate a number of logistic regression models of the form

\[
\ln\!\left(\frac{p_{it}}{1 - p_{it}}\right) = x_i \beta + \delta D_{it} + \chi_{a(i,t)} + \phi_t, \tag{1}
\]

where $p_{it} \equiv \Pr\{\mathrm{Death}_{it} = 1 \mid \mathrm{Death}_{i,t-1} = 0\}$ is the hazard of worker $i$ dying in year $t$ given survival through year $t - 1$, and $D_{it}$ is a dummy variable equal to one if worker $i$ has been displaced prior to year $t$ and zero otherwise.17 Thus, the coefficient on the indicator variable for displacement measures the increase in the log odds of death in a given year, holding constant the other variables in the model. Because the probability of death is typically

16. Note that we also replicated standard estimates of the age gradient in mortality for our sample (Appendix, Figure 1, Sullivan and von Wachter [2007]) and found them to be quite similar to typical patterns for representative U.S. samples.
17. This is a standard logistic regression model, and we obtain our parameter estimates by maximum likelihood. Workers contribute one observation for each year that they are alive during the follow-up period. The risk set evolves over time as workers die. Efron (1988) shows that the logistic model we estimate approximates standard continuous parametric models of the survival hazard.
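The person-year structure described in footnote 17 can be illustrated with a minimal sketch, not the authors' code. It assumes a hypothetical `Worker` record and builds one row per worker per year at risk, with the death indicator and the displacement dummy defined as in model (1). The rows would then be passed to any maximum-likelihood logit routine together with age and year controls.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Worker:
    # Hypothetical record layout, for illustration only.
    worker_id: int
    birth_year: int
    displacement_year: Optional[int]  # None if never displaced
    death_year: Optional[int]         # None if alive at the end of follow-up


def person_year_rows(w: Worker, start: int, end: int) -> List[Dict[str, int]]:
    """Return one observation per year in [start, end] in which the worker is at risk."""
    rows = []
    for year in range(start, end + 1):
        # A worker who died in an earlier year has left the risk set.
        if w.death_year is not None and w.death_year < year:
            break
        rows.append({
            "worker_id": w.worker_id,
            "year": year,
            "age": year - w.birth_year,
            # D_it: one if the worker was displaced prior to year t.
            "displaced": int(w.displacement_year is not None
                             and w.displacement_year < year),
            # Death_it: one in the year of death, zero while alive.
            "death": int(w.death_year == year),
        })
    return rows


# A worker displaced in 1982 who dies in 1990, followed from 1987 to 2006,
# contributes rows for 1987 through 1990 only.
rows = person_year_rows(Worker(1, 1940, 1982, 1990), 1987, 2006)
print(len(rows), [r["death"] for r in rows])
```

Because each worker stops contributing rows after the year of death, the pooled rows reproduce the shrinking risk set that the footnote describes.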
quite small, the increase in the log-odds ratio approximates the percentage increase in the death rate itself. In some models, we also include interactions of the displacement dummy with other variables, which allows the effect of displacement on mortality to vary in a number of important ways.

All the specifications we report below include year dummies ($\phi_t$), which among other things may control for variation over time in the completeness of the SSA’s death records. They also include a fourth-order polynomial in age ($\chi_{a(i,t)}$). Results are very similar if the age quartic is replaced by an unrestricted set of age dummies, or even a simple linear time trend. None of our results are sensitive to the logistic functional form; they are all evident in straightforward tabulations of average mortality rates and in linear probability models.

The firm-level shocks that lead to employment reductions should be exogenous to workers’ individual health problems. However, it is possible that firms faced with the need to reduce employment may tend to lay off their least productive workers, who may in turn be in poor health. To address this potential problem, we consider a number of specifications that control for variables likely to capture productivity differences in the period 1974–1979 ($x_i$). In Section III.D, we summarize several additional robustness checks confirming that our results are not affected by selective job displacement.

III. DISPLACEMENT, MORTALITY, AND LIFE EXPECTANCY

This section presents our basic estimates of the effect of displacement on the mortality hazard. We first show results based on models that assume a constant effect on the hazard. We then show how the effect varies with time since displacement and other variables. Finally, we derive the implications of our estimates for life expectancy and summarize our sensitivity analysis.

III.A. Displacement and the Mortality Hazard

The first column of Table II shows estimates of the coefficient on the displacement dummy of model (1) for various sets of control variables ($x_i$). Models are estimated using the full sample of workers over the entire period 1980–2006. Controlling only for the mean and standard deviation of earnings during the period 1974–1979 as shown in row (1), we estimate that displacement is associated with about a 17% increase in the mortality hazard. The
TABLE II
EFFECT OF JOB DISPLACEMENT ON LOG-ODDS OF DEATH FOR VARIOUS SAMPLES, FOLLOW-UP PERIODS, AND SPECIFICATIONS (WORKERS IN STABLE EMPLOYMENT 1974–1979, FIRM 1979 EMPLOYMENT ≥50, BORN 1930–1959)

| | No work restriction (1) | No work restriction (2) | Work at least three years (3) | Work every year (4) | Work every year, exclude non-MLF separators (JLS sample) (5) |
| --- | --- | --- | --- | --- | --- |
| Death follow-up period | 1980–2006 | 1987–2006 | 1987–2006 | 1987–2006 | 1987–2006 |
| (1) Baseline model with average and std. dev. of earnings in 1974–1979 | 0.170 (0.036) | 0.147 (0.037) | 0.148 (0.038) | 0.088 (0.044) | 0.104 (0.046) |
| (2) Model in row (1) with one-digit industry fixed effects | 0.170 (0.037) | 0.137 (0.038) | 0.139 (0.039) | 0.077 (0.045) | 0.098 (0.047) |
| (3) Model in row (1) with one-digit industry effects and added career variables | 0.163 (0.038) | 0.129 (0.039) | 0.128 (0.040) | 0.069 (0.047) | 0.088 (0.048) |
| (4) Model in row (1) with industry effects and career variables*age interactions | 0.169 (0.037) | 0.136 (0.038) | 0.138 (0.039) | 0.077 (0.045) | 0.098 (0.047) |
| (5) Linear probability model (specification row (2)) | 0.0012 (0.00026) | 0.0011 (0.00032) | 0.0012 (0.00031) | 0.0006 (0.00034) | 0.0008 (0.00034) |
| (6) Linear probability model (specification row (1)) with firm effects | 0.0013 (0.00038) | 0.0008 (0.00050) | 0.0010 (0.00048) | 0.0006 (0.00054) | 0.0009 (0.00051) |
| (7) Percentage effect for linear probability model in row (5) | 0.194 (0.041) | 0.140 (0.027) | 0.143 (0.041) | 0.081 (0.029) | 0.103 (0.042) |
| (8) Percentage effect for linear probability model in row (6) | 0.200 (0.059) | 0.104 (0.042) | 0.120 (0.063) | 0.075 (0.046) | 0.117 (0.063) |
| Observations | 553,167 | 402,844 | 392,536 | 334,598 | 291,373 |

Notes. The dependent variable is the log odds of death in a year between 1980 (or 1987) and 2006. Deaths can occur anywhere in the United States. The entries in the table are the coefficients on a dummy for job loss during mass layoff. See Sullivan and von Wachter (2007) for marginal effects. Columns represent different samples; rows represent different model specifications. All models include year effects and a quartic in age as well as the indicated variables in the first column. In all models, the average of quarterly earnings 1974–1979 is entered in logs; the standard deviation is of the log quarterly earnings, also entered in logs. Industry dummies are for nonmanufacturing goods, nondurables manufacturing, other durables manufacturing, steel manufacturing, transportation–construction–public-utilities, trade, and services. The additional “career variables” in rows (3) to (6) are growth in quarterly earnings during 1974–1979 and the total time spent in non-employment in 1974–1979. Row (4) interacts the log of average earnings and the log of the standard deviation of log earnings with five dummies for age at layoff. The firm in row (6) refers to the 1979 employer. The last row shows the number of person–year observations. Column (5) corresponds to the sample used in JLS (1993), which excludes nonmass layoff separators from the control group. Standard errors are in parentheses (for rows (7) and (8), these are calculated by the delta-method).
remaining rows probe the robustness of this result. Adding 1-digit industry fixed effects, the growth in earnings and the number of quarters of zero earnings during the base period, and interactions of the career variables with age as shown in rows (2)–(4) has very little effect on the estimate.18 Row (5) shows the estimate for a linear probability model version of row (2), whereas row (6) shows results from a linear probability model that includes firm fixed effects. When expressed as percentages of the baseline hazard, these latter two estimates are modestly higher than but in the same ballpark as the estimates from the logit models. Overall, there is no indication that our effects can be explained by firms selectively displacing less productive workers who are also less healthy than their peers. Similarly, it does not appear that firms or sectors with high average layoff rates provide less healthy career environments or attract less healthy workers.19 The remaining columns of Table II show the impact of changing the data set over which model (1) is estimated. In column (2) we continue to use the full set of workers, but restrict the time period over which we track mortality to 1987–2006. Restricting the time period in this way lowers the estimates to the 10%–15% range, which is consistent with the biggest effects being observed immediately after displacement. In column (3), we restrict the set of workers to those who have positive reported earnings in at least three years between 1980 and 1986. This has very little effect on the estimates. However, requiring workers to have earnings in all years, as shown in column (4), lowers the estimates to the 7%–9% range. This suggests that part of the effect estimated on our more general sample in column (1) is due to workers permanently dropping out of the labor force or leaving PA. However, the majority of the effect is still present for workers with stable attachment to the PA labor force after job loss. 
Finally, column (5) shows results using the original JLS sample, which, in addition to requiring earnings in each year from 1980 to 1986, drops nonmass-layoff separators. These estimates again range from 9% to 18. If we include only year and age effects for our most general sample in column (1), we obtain a displacement effect of 0.227 (0.0354); if we include only log average earnings as an additional control variable, we obtain 0.2005 (0.0355). 19. The fact that the within-firm estimate of the effect of displacement on death is not smaller suggests that workers remaining at the firm experiencing mass layoff do not have higher mortality that may have arisen, say, due to increased uncertainty. This is consistent with our finding that mortality increases are correlated with large earnings losses of displaced workers.
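The logit coefficients in Table II are read in the text as approximate percentage effects on the hazard. The following is a minimal sketch of the exact conversion, assuming for illustration a baseline hazard equal to the 1987–2006 death rate of nondisplaced workers in Table I (6.325 per 1,000). The function names are hypothetical, and the baseline is an assumption made here rather than the one used by the authors.

```python
import math


def pct_change_in_odds(delta: float) -> float:
    """Exact percentage change in the odds of death implied by a logit coefficient."""
    return 100.0 * (math.exp(delta) - 1.0)


def pct_change_in_hazard(delta: float, baseline_hazard: float) -> float:
    """Exact percentage change in the annual death probability at a given baseline."""
    odds0 = baseline_hazard / (1.0 - baseline_hazard)
    odds1 = odds0 * math.exp(delta)
    hazard1 = odds1 / (1.0 + odds1)
    return 100.0 * (hazard1 / baseline_hazard - 1.0)


delta = 0.170              # Table II, row (1), column (1)
baseline = 6.325 / 1000.0  # assumed baseline hazard, from Table I

print(round(pct_change_in_odds(delta), 1))
print(round(pct_change_in_hazard(delta, baseline), 1))
```

At hazards this small, the change in the odds and the change in the death probability are nearly identical, which is why the coefficient can be read directly as an approximate proportional effect. The exact conversion is slightly larger than the coefficient itself because exp(δ) − 1 exceeds δ for any positive δ.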
11%.20 Overall, we consider the estimates shown in Table II as indicating a reasonable degree of robustness to the set of additional control variables and the sample of workers included in the estimation.

Table III displays the other coefficients in the models of row (3) of Table II, which, as we discuss in Section IV, are useful for trying to understand the channels through which displacement affects mortality. The elasticity of the mortality hazard with respect to average quarterly earnings in 1974 to 1979 is about −0.5.21 The elasticity of mortality with respect to the standard deviation of the logarithm of quarterly earnings is estimated to be around 0.17, indicating that higher earnings variability tends to increase mortality. Holding average earnings and earnings variability constant, an additional quarter of nonemployment due to sick leave or temporary layoffs in the base period reduces mortality by about 9%, an effect that is, perhaps, consistent with the findings of Ruhm (2000). Conditional on the other variables, the earnings growth trend from 1974 to 1979 has little effect on mortality.

20. This suggests that non-mass-layoff separators experience mortality increases as well, which is confirmed in Appendix Table 3 of our longer working paper (Sullivan and von Wachter 2007); this is not surprising because for high-attachment workers in the difficult economic environment in the early to mid-1980s, most job separations tended to lead to nontrivial earnings losses (see also von Wachter, Song, and Manchester [2009]).
21. This estimate is somewhat higher than typical estimates of the correlation of mortality with a single year of income (e.g., Deaton and Paxson [1999]). This is because our data on average earnings over a six-year period do a better job capturing a notion of permanent income and are less affected by measurement error present in self-reported income measures in survey data. This is further discussed in Sullivan and von Wachter (2009) and in Appendix 1 of our longer working paper.

III.B. Mortality Effects by Year since Layoff, Age, and Job Tenure

The results in Table II suggest that the immediate impact of displacement on mortality differs from the long-run effect. To explore this pattern further, column (1) in Table IV breaks up the effect of displacement on death by year since layoff for our most general sample (as in column (1) in Table II). Row (1) shows the long-run effect in manufacturing industries. The effect is statistically significantly different from zero and substantial even at 16 or more years after layoff. The last row of the table shows that this effect is not statistically significantly different in nonmanufacturing industries. The remaining rows of column (1) show how the effect differs in the first 15 years after layoff. To obtain the full effect of
TABLE III
COEFFICIENTS ON CAREER VARIABLES IN EXTENDED LOG-ODDS OF DEATH MODEL (VARIOUS SAMPLES, WORKERS IN STABLE EMPLOYMENT 1974–1979, FIRM 1979 EMPLOYMENT ≥50, BORN 1930–1959)

Work restriction in Pennsylvania labor market during 1980–1986, by column:
(1) No work restriction
(2) No work restriction
(3) Work at least three years
(4) Work every year
(5) Work every year, exclude non-MLF separators (JLS sample)

                                                        (1)        (2)        (3)        (4)        (5)
Death follow-up period                               1980–2006  1987–2006  1987–2006  1987–2006  1987–2006
Displacement dummy                                     0.163      0.129      0.128      0.069      0.069
                                                      (0.038)    (0.040)    (0.040)    (0.047)    (0.047)
Log(average quarterly earnings 1974–1979)             −0.504     −0.516     −0.499     −0.472     −0.472
                                                      (0.055)    (0.057)    (0.058)    (0.066)    (0.066)
Log(std. dev. of log quarterly earnings 1974–1979)     0.172      0.163      0.170      0.174      0.174
                                                      (0.027)    (0.028)    (0.028)    (0.032)    (0.032)
Number of quarters in nonemployment 1974–1979         −0.090     −0.090     −0.087     −0.095     −0.095
                                                      (0.025)    (0.026)    (0.026)    (0.031)    (0.031)
Growth in quarterly earnings 1974–1979                −0.002      0.008      0.016      0.015      0.015
                                                      (0.052)    (0.054)    (0.055)    (0.062)    (0.062)
1-digit dummies for 1979 industry                       Yes        Yes        Yes        Yes        Yes
Observations                                          505,316    367,890    358,660    308,345    308,345

Notes. These are coefficients on covariates included in model (3) of Table II. Please refer to notes to Table II for further explanations. Standard errors are in parentheses.
TABLE IV
MORTALITY IMPACT OF JOB DISPLACEMENT BY TIME SINCE DISPLACEMENT, AGE-GROUP, INDUSTRY, AND TENURE AT JOB LOSS FOR DIFFERENT SAMPLES (WORKERS IN STABLE EMPLOYMENT 1974–1979, FIRM 1979 EMPLOYMENT ≥50, NO FURTHER PRESENCE RESTRICTION IN PA LABOR MARKET)

Restriction on job tenure: columns (1)–(3), tenure in 1979 at least six years; columns (4)–(6), tenure in 1979 at least three years.

                                                          (1)        (2)        (3)        (4)        (5)        (6)
Birth cohort                                           1930–1959  1920–1959  1920–1959  1930–1959  1920–1959  1920–1959
Displacement effect 16+ years after displacement         0.131      0.108      0.133      0.161      0.123      0.152
                                                        (0.054)    (0.034)    (0.055)    (0.05)     (0.032)    (0.05)
Added effect for 1 year after displacement year          0.716      0.619      0.582      0.782      0.606      0.585
                                                        (0.199)    (0.105)    (0.113)    (0.176)    (0.099)    (0.106)
Added effect for 2–3 years after displacement year       0.559      0.307      0.279      0.525      0.318      0.303
                                                        (0.147)    (0.084)    (0.091)    (0.136)    (0.078)    (0.084)
Added effect for 4–5 years after displacement year       0.198      0.040      0.020      0.204      0.033      0.024
                                                        (0.147)    (0.082)    (0.086)    (0.135)    (0.077)    (0.081)
Added effect for 6–10 years after displacement year      0.057      0.045      0.036      0.027      0.053      0.051
                                                        (0.094)    (0.054)    (0.057)    (0.087)    (0.051)    (0.053)
Added effect for 11–15 years after displacement year    −0.066     −0.045     −0.046     −0.053     −0.044     −0.042
                                                        (0.081)    (0.047)    (0.048)    (0.073)    (0.044)    (0.045)
Displacement and current age less than or equal to 45                           0.383                            0.220
                                                                               (0.131)                          (0.116)
Displacement and current age between 46 and 55                                  0.136                            0.117
                                                                               (0.075)                          (0.066)
Displacement and current age above age 65                                      −0.003                           −0.009
                                                                               (0.054)                          (0.050)
Displacement at age 60–69                                                      −0.092                           −0.099
                                                                               (0.041)                          (0.039)
Displaced from nonmanufacturing job                      0.045     −0.065     −0.070     −0.040     −0.059     −0.066
                                                        (0.084)    (0.051)    (0.051)    (0.074)    (0.047)    (0.047)

Notes. Samples are workers born 1920–1959 in stable jobs from 1974–1979 with an employer of over 50 workers. Dependent variable is the log odds of death. Death can occur anywhere in the United States. Entries are coefficient estimates from the logit model. All models include year fixed effects, industry fixed effects, a quartic in age, the log of average quarterly earnings in 1974–1979, and the log of the standard deviation of quarterly earnings in 1974–1979. The coefficient in the first row is the main effect; the excluded categories are “16 or more years after displacement” and “displaced from manufacturing job.” In addition, in columns (2), (3), (5), and (6), the excluded group is “displacement and current age between 56 and 64.” Standard errors are in parentheses.
FIGURE II
The Effect of Displacement on Log-Odds of Death by Years since Displacement (Sample of Men in Stable Employment 1974–1979, Firm 1979 Employment ≥50, No Further Presence Restriction in PA Labor Market)
(A) Effect by years since displacement for workers born 1930–1959 (including two-standard-error bands). Solid line represents coefficients of log-odds model of mortality on years since displacement and basic other control variables. These are the main effects corresponding to column (1), Table IV. Dashed lines represent two-standard-error bands.
(B) Simulated effect of displacement by current age and age at displacement for workers born 1920–1959. The lines represent coefficients from a log-odds model of death on four dummies for current age interacted with displacement, to which dummies for years since displacement were added, as well as a dummy for whether age at displacement was sixty or greater. Coefficients are taken from column (3), Table IV. See text for details.
displacement on mortality at different years since displacement, the coefficients on the interactions have to be added to the main effect in row (1).22 We see large percentage increases immediately after job displacement. The effect remains high for the first five years after job loss, then gradually declines with time since layoff, and bottoms out at a long-run average of about 13%. This is shown graphically in Panel A of Figure II, which plots the point

22. For example, for a displaced worker two to three years after layoff, the effect of displacement on mortality would be 0.131 + 0.559 = 0.69.
estimates and two-standard-error bands. These estimates suggest strong immediate responses when the impact of a layoff on earnings, employment, job mobility, and other career outcomes is most severe. The effects then stabilize at a permanent difference as workers continue to suffer negative consequences of layoffs in terms of reduced earnings. As in Tables II and III, column (1) of Table IV includes workers born after 1930 that we observe up to age 76. To further study the long-run effect of displacement on death, column (2) shows the same estimates when we include workers whom we observe closer to the end of their lives (up to age 86).23 The short- and long-run effects of job loss on the long-term mortality rate are similar but somewhat smaller. This suggests that the proportional effect of displacement on the mortality rate varies by age. Thus, column (3) includes interactions of displacement with age groups. The excluded age group is 56–64. The specification also includes a dummy for whether workers were displaced near retirement age (ages 60–69). We find that workers younger than age 55 suffer significantly higher percentage increases in mortality hazards in response to a displacement than older workers. This difference is particularly strong for workers under 45 but still present for workers aged 46 to 54. We also find that workers displaced near retirement age appear to respond significantly less to job loss than workers displaced in middle age. This is perhaps not surprising, because older workers are more likely to have access to Social Security benefits, to company pension plans, or to Medicare. Even for workers not yet at retirement age, access to federal disability insurance increases substantially at age 55, when workers can claim loss of vocational qualifications to qualify for disability insurance (e.g., Chen and Van der Klaauw [2008]; Black, Daniel, and Sanders [2002]). 
Younger and middle-aged workers do not have access to similar mechanisms to smooth long-term earnings losses. Moreover, as further discussed in Section IV, for these workers the reduction in lifetime earnings is larger because earnings losses accrue over a longer period.24

23. For men of the birth cohorts in our samples at ages forty to fifty, average life expectancy is about 70–75 (National Center for Health Statistics 2006, Table 11).

24. Another possibility is that more frail displaced workers die first, reducing the gap in mortality rates between displaced workers and nondisplaced workers. Such dynamic selectivity would lead us to understate the effect of displacement on mortality as workers age. However, because the number of deaths is small relative to the overall population at risk of death, the average underlying health of the
As workers age, the total effect of displacement on mortality is determined by the sum of the long-run mortality effect in row (1) and the coefficients of the relevant interactions of displacement with year since displacement, current age, and age at layoff; for example, for a worker under 45 and two to three years after displacement, the effect of displacement on mortality is 0.795 (=0.133 + 0.279 + 0.383). The resulting total effects of displacement on mortality by current age for different ages of separation are shown in Panel B of Figure II. The entries in the figure are obtained by summing up the relevant coefficients in column (3) of Table IV.25 In the first years after layoff, workers at all ages experience large increases in the probability of death in the range of 50%–100%. The effect declines as workers age and settles at a positive long-run effect sixteen years after layoff for all age groups (the effect in row (1) of Table IV). Only for workers displaced after age sixty is the long-run effect close to zero.26 To assess whether our results are robust to restricting our sample to those with at least six years tenure, columns (4) to (6) of Table IV replicate the estimates for a sample of workers with at least three years of tenure. Column (4) shows that the results are now larger both in the long and the short run. This is not surprising, because the lower tenure requirement tends to draw younger workers into the sample of displaced workers, and we find that younger workers tend to show a larger mortality effect. Once we consider a sample including older workers (column (5)) and control for age (column (6)), the results are quite similar across tenure restrictions. Thus, reducing the tenure requirement to at least three years of job tenure at displacement does not substantially affect our results, especially once we account for differences in the effect of job displacement on mortality by age.
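The additive structure described in this paragraph can be made concrete with a short sketch. The coefficients are copied from columns (1) and (3) of Table IV; the dictionary layout and the helper name `total_effect` are illustrative assumptions and not part of the original analysis.

```python
# Log-odds coefficients copied from Table IV; names and layout are illustrative.
COL1 = {
    "main": 0.131,  # long-run effect, 16+ years after displacement
    "years": {"1": 0.716, "2-3": 0.559, "4-5": 0.198, "6-10": 0.057, "11-15": -0.066},
    "age": {},      # column (1) has no age interactions
}
COL3 = {
    "main": 0.133,
    "years": {"1": 0.582, "2-3": 0.279, "4-5": 0.020, "6-10": 0.036, "11-15": -0.046},
    "age": {"<=45": 0.383, "46-55": 0.136, "56-64": 0.0},  # 56-64 is the excluded group
}


def total_effect(col, years=None, age=None):
    """Sum the main effect and the relevant added effects (all in log odds)."""
    effect = col["main"]
    if years is not None:
        effect += col["years"][years]
    if age is not None:
        effect += col["age"][age]
    return effect


# Footnote 22 case: column (1), two to three years after layoff.
print(round(total_effect(COL1, years="2-3"), 3))
# Worked example from the text: column (3), age 45 or younger, two to three years after layoff.
print(round(total_effect(COL3, years="2-3", age="<=45"), 3))
```

Passing neither `years` nor `age` returns the long-run effect in row (1) alone, which is the value the total effect settles at sixteen or more years after layoff.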
population at risk is unlikely to be greatly affected by selection. Thus, the effect of dynamic selection is likely to be small in the present case.

25. We obtain similar patterns with complete interactions of age-at-displacement and years-since-layoff; however, with the increased number of parameters, the estimates become imprecise.

26. The figure also shows that for workers aged fifty and older at displacement, the effect actually increases somewhat 15 years after job loss, amounting to a U-shape, albeit a weak one. A similar pattern has been observed for the event of losing a spouse, which leads to an initial increase in mortality from stress and a weak long-term rise of the mortality rate to the level of single individuals (e.g., Martikainen and Valkonen [1996]). The pattern is suggestive of an initial response due to acute stress caused by the job loss, followed by a long-term cumulative impact of increased chronic stress due to lower earnings.
III.C. The Reduction in Life Expectancy because of Job Loss

Because increased mortality affects workers over a long time horizon, our estimates imply substantial reductions in life expectancy. The average loss in life expectancy can be used as a summary measure of the cost of job loss at mass layoff in terms of life-years, much as the present discounted value of lifetime earnings losses provides a summary measure in conventional analyses of the cost of job loss. In Table V, we present a range of estimates of losses in life expectancies for alternative samples and ages. Because some cohorts are still alive at the end of the sample period, to calculate the total cumulated effect of permanently greater mortality hazards we have to make an assumption about the development of mortality differences between laid-off workers and the control group past our observation window.27 Specifically, we assume that the proportional increase in the odds of death that we estimate for the highest observed age is maintained indefinitely. Given that the typical profile of the increase in the log odds of dying is stable through older age ranges, and remains so within the groups of our sample, this is a plausible assumption.28 All life expectancies are based on our most general sample with no further employment restriction, include older workers, and are calculated for workers with average annual earnings in 1974–1979 working in nonmanufacturing industries in 1979. Because there are large increases in mortality right after layoff, Table V is based on estimates of survival curves that take into account the dynamic response in mortality found in Table IV. Because the effect of displacement status also differs by age, we use our most general specification in Table IV to calculate life expectancy. The last column of Table V shows that losses in life expectancy are larger for workers losing their jobs in their thirties and forties. Life expectancy of workers displaced in their

27. Life expectancy can be calculated as the sum of survivor probabilities over the remaining potential age-years of an individual. The difference in life expectancies then is the sum of the differences in survivor probabilities. We experimented with different windows of extrapolation and found that our results are not driven by differences in extrapolated survival probabilities outside our sample period.

28. Because extrapolation of a quartic polynomial in age can be unstable, to calculate life expectancies we worked with a linear age specification for calculating differences in life expectancy. We find that the log-odds ratio is well approximated by a linear function in age. Efron (1988) shows that with a linear age-component, our logistic model would imply a Gompertz-distribution for lifetime in a continuous time setting, a distribution commonly used in the analysis of survival times. We have also experimented with a quartic polynomial and found that it did not significantly alter our results.
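Footnotes 27 and 28 describe life expectancy as a sum of survivor probabilities under a logistic hazard that is linear in age. A minimal sketch of that calculation follows. The intercept `A`, the slope `B`, the truncation age `MAX_AGE`, and the constant log-odds shift `DELTA` are hypothetical values chosen only for illustration; they are not the paper's estimates, and the sketch ignores the time-varying and age-specific effects of Table IV that enter the figures in Table V.

```python
import math

A, B = -10.5, 0.09   # hypothetical intercept and age slope of the log odds of death
MAX_AGE = 110        # hypothetical truncation age
DELTA = 0.13         # hypothetical constant shift in log odds after displacement


def death_prob(age, shift=0.0):
    """Annual death probability from a logistic model with a linear age term."""
    return 1.0 / (1.0 + math.exp(-(A + B * age + shift)))


def life_expectancy(start_age, shift=0.0):
    """Expected age at death: start age plus the sum of survivor probabilities."""
    survive, expected_years = 1.0, 0.0
    for age in range(start_age, MAX_AGE):
        survive *= 1.0 - death_prob(age, shift)  # probability of surviving through this age
        expected_years += survive
    return start_age + expected_years


base = life_expectancy(40)
displaced = life_expectancy(40, DELTA)
print(f"not displaced: {base:.2f}, displaced: {displaced:.2f}, lost: {displaced - base:.2f}")
```

Because the shift in the log odds is held fixed at every age, the difference between the two figures is simply the sum of the differences in survivor probabilities, as footnote 27 notes.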
TABLE V
IMPACT OF JOB DISPLACEMENT ON LIFE EXPECTANCY BY AGE AT SEPARATION AND JOB TENURE

Sample (1): Stable job 1974–1979; no restrictions on earnings 1980–1986; 1920–1959 birth year; tenure in 1979 at least six years.
Sample (2): Stable job 1974–1979; no restrictions on earnings 1980–1986; 1920–1959 birth year; tenure in 1979 at least three years.
Displacement interactions included (both samples): Years since displacement categories; Current age categories; Displaced age ≥60; Nonmanufacturing.

Sample   Age at       Life expectancy        Life expectancy     Lost years of life
         separation   given not displaced    given displaced     due to displacement
(1)      30           76.45                  74.85               −1.59
         35           76.56                  74.99               −1.56
         40           76.73                  75.22               −1.51
         45           76.99                  75.58               −1.41
         50           77.37                  76.01               −1.36
         55           77.92                  76.64               −1.29
(2)      30           76.56                  74.97               −1.59
         35           76.67                  75.10               −1.57
         40           76.85                  75.29               −1.56
         45           77.11                  75.58               −1.53
         50           77.49                  76.00               −1.50
         55           78.05                  76.62               −1.43

Notes. All models include log of mean earnings, log of standard deviation of log quarterly earnings, one-digit industry dummies, and a linear age effect. The rows labeled (1) and (2) correspond to models equivalent to columns (3) and (6) of Table IV, respectively. The numbers are based on a linear extrapolation in age for cohorts still alive. See text for further information.
fifties still declines, but by less. This difference arises both because younger workers are exposed to higher mortality rates over a longer period of time and because the increase in mortality rates tends to be greater for younger workers.29 In our longer working paper (Sullivan and von Wachter 2007), we confirm similar orders of magnitude for samples restricted to workers who have some employment in the period of job loss. Given these substantial declines in life expectancy, an important question is how these losses should be treated with respect to more conventional estimates of the cost of job loss. The typical measure of the cost of displacements is based on the loss in present discounted value of earnings. For the average worker in our high-tenured sample, this amounts to about $200,000 at a real interest rate of 4%. To make the losses in life expectancy comparable, we can monetize them by choosing an estimate of the statistical value of life. At about five million dollars per statistical life, a loss of one and one-half years would amount to a monetary loss of $100,000. Although they can be no more than broadly indicative, values of this order of magnitude imply that the cost of job loss for displaced workers may be substantially underestimated by traditional summary measures such as the loss in lifetime earnings.30

III.D. Sensitivity Analysis: Pooling Displaced and Nondisplaced Workers

We have shown that our estimates are robust to the inclusion of an extensive set of alternative control variables as well as industry and firm fixed effects. This suggests that any remaining bias from selective displacement or from sorting of workers into firms is likely to be small, especially in the environment of

29. Because life expectancy is the sum of the survivor probabilities over the remaining potential lifetime, summing up reduced survivor probabilities over a longer period reduces predicted life expectancy. The effect of displacement on survivor probabilities varies with age because of the functional form of the logistic function itself (as discussed in Sullivan and von Wachter [2007, Figure 3]), as well as from explicit age-displacement interactions as included in Table IV.

30. The loss in the present discounted value (PDV) of earnings is a sufficient statistic for the cost of job loss only if long-term health is exclusively an outcome of optimally chosen inputs given a lifetime budget constraint. In this case, the observed reduction of lifetime would represent the optimal response to the decline in resources following a job loss. However, only a small fraction of health expenditures are out of pocket. Moreover, health is likely to be affected by factors other than consumption or health inputs that are directly affected by job loss, such as social status. In that case, to obtain the total costs of job displacements, at least part of the monetary value of the direct effect of layoffs on mortality should be added to the PDV of earnings losses.
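The monetization of lost life-years in the text can be reproduced with a few lines of arithmetic. The value of a statistical life and the loss of one and one-half years are taken from the text; spreading that value evenly over a lifespan of about 75 years, with no discounting, is an assumption made here for illustration and is not stated in the source.

```python
VALUE_OF_STATISTICAL_LIFE = 5_000_000  # dollars, from the text
LOST_YEARS = 1.5                       # years of life lost, from the text
ASSUMED_LIFESPAN = 75                  # years; assumption made for this sketch only


def monetized_loss(vsl, lost_years, lifespan):
    """Value lost life-years at a constant per-year share of the statistical value of life."""
    value_per_year = vsl / lifespan
    return value_per_year * lost_years


loss = monetized_loss(VALUE_OF_STATISTICAL_LIFE, LOST_YEARS, ASSUMED_LIFESPAN)
print(f"monetized loss: ${loss:,.0f}")
```

Under these assumptions the monetized loss is about half of the $200,000 loss in the present discounted value of earnings quoted in the text; a different assumed lifespan or a discount rate would change the figure.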
PA in the early to mid-1980s, when employment reductions were often severe. Moreover, in this period, most firms in the sectors most prominent in our sample either were unionized or followed seniority rules in dismissals (Abraham and Medoff 1984). Both of these factors are likely to have reduced employers’ ability to selectively displace workers. To address the question of a remaining bias from selective displacement directly, in this section we present estimates that pool displaced and nondisplaced workers. These are based on a specification similar to equation (1), but with the variable of interest defined at the firm level, rather than the worker level. Because the mass-layoff dummy now varies only at the firm level, the estimates compare the change over time in mortality of all workers present at a firm experiencing a mass layoff with that of similar workers at stable firms. By construction, these “intent-to-treat” estimates are not affected by selection at the time of job loss or by misclassification of dying workers as job losers.31 The results shown in Table VI are consistent with the findings of our main analysis. Working at the firm level considerably reduces our degrees of freedom, so to maximize precision, samples include workers born after 1920 and the specification includes fewer interactions. In the first two columns, mass layoff is defined as in our main estimates: employment falling 30% below its previous peak. In many cases, however, employment declines are gradual, so firms may be laying off workers before they reach the 30% threshold. Thus, in the third and fourth columns, we present a specification in which mass layoff is defined as a sudden large drop in employment at the firm.32 To compare the magnitudes of the estimates in Table VI to those in earlier tables, columns (2) and (4) show estimates divided by 0.3, which is approximately the effect of mass layoff on

31. The estimates may still be affected by sorting of workers into firms prior to layoff. In our main analysis, we showed that the results are unaffected by the inclusion of firm fixed effects, indicating that sorting plays no major role in our sample. Here, we cannot include firm fixed effects without losing precision. Yet the results are robust to the inclusion of industry effects and of controls for characteristics of employers (average wage and average employment size from 1974 to 1979 of a worker’s 1979 employer).

32. Our results confirm that the timing of the shock at the firm level appears better captured by a sudden drop in employment. In the case of more gradual employment reductions, it is generally difficult to assign the year of a distinct shock occurring at the firm level. This is less of a problem at the individual level in our main estimates, because an individual’s job separation always constitutes a distinct treatment.
TABLE VI
INTENT-TO-TREAT ESTIMATES OF THE EFFECT OF MASS-LAYOFF AT FIRM LEVEL ON MORTALITY POOLING MOVERS AND STAYERS (WORKERS IN STABLE EMPLOYMENT 1974–1979, FIRM 1979 EMPLOYMENT ≥50, BORN 1920–1959, NO FURTHER PRESENCE RESTRICTION IN PA LABOR MARKET)

Mass layoff definition: columns (1)–(2), year in which firm employment drops 30% relative to peak employment in 1974–1979; columns (3)–(4), year in which firm employment drops 30% relative to employment in year t−2.

                                                         Coefficient   Rescaled for   Coefficient   Rescaled for
                                                         estimate      comparison     estimate      comparison
                                                         (1)           (2)            (3)           (4)
Panel A: Average effect of mass layoff at 1979 employer on hazard of death
Average effect in 1980–2006                               0.066         0.221          0.042         0.139
                                                         (0.027)                      (0.023)
Panel B: Dynamic effect of mass layoff at 1979 employer on hazard of death, 1980–2006
Effect in year of mass layoff                             0.252         0.841          0.275         0.916
                                                         (0.087)                      (0.101)
Effect 1–15 years after year of mass layoff               0.059         0.196          0.028         0.092
                                                         (0.036)                      (0.032)
Effect 16+ years after year of mass layoff                0.029         0.095          0.047         0.158
                                                         (0.035)                      (0.029)
Includes mean earnings and employment of 1979 employer    Yes                          Yes
Includes effects for 1979 industry                        Yes                          Yes

Notes. Entries are coefficients on firm-level mass-layoff dummy (Panel A) and its interactions with year since layoff (Panel B) in a logit model of the event of dying in a given year. All models include a quartic in age, year effects, the log of average quarterly earnings in 1974–1979, the log of the standard deviation of log quarterly earnings in 1974–1979, and the average of the 1979 employer’s employment and quarterly earnings from 1974 to 1979 as control variables. In addition, the models include six dummies for 1979 industry. Columns (2) and (4) divide the coefficients in columns (1) and (3) by 0.3, the effect of mass layoff at the firm level on job separations. Standard errors clustered at the level of the 1979 employer are in parentheses.
job mobility.33 For either definition of firm-level mass layoff, the rescaled long-run effects of displacement on mortality shown in the bottom of Panel B in Table VI are of the same order of magnitude as the estimates shown in row (1) of Table IV.34 Table VI also suggests a large effect of displacement on mortality in the year of layoff. This suggests that dropping separators who die in the year of layoff in our main analysis may lead us to underestimate the full effect of displacement on mortality shown in Table II. Correspondingly, row (1) of Table VI shows that the average effect of displacement on the mortality hazard, properly rescaled, is somewhat larger. These findings suggest that our main results are unlikely to be driven by selective displacement of less healthy workers. Two further results contained in the Appendix and further discussed in our longer working paper indicate that firms with greater flexibility in selecting which workers to lay off did not systematically displace their least healthy workers. First, we show that the effect of displacement on mortality does not decline with the fraction of workers involved in a mass layoff. Second, we also find that other separators (those permanently leaving their long-term employers but not during a mass layoff) do not have higher mortality rates.35
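The rescaling behind columns (2) and (4) of Table VI is a simple division of the intent-to-treat coefficient by the effect of a firm-level mass layoff on job separation. The sketch below uses the coefficient estimates reported in columns (1) and (3) and the rounded divisor of 0.3; the names `ITT` and `rescale` are illustrative, and small third-decimal differences from the published rescaled values are to be expected from rounding in the reported coefficients.

```python
SEPARATION_EFFECT = 0.3  # approximate effect of a firm-level mass layoff on job separation

# Coefficient estimates from Table VI: (column (1), column (3)).
ITT = {
    "average 1980-2006": (0.066, 0.042),
    "year of mass layoff": (0.252, 0.275),
    "1-15 years after": (0.059, 0.028),
    "16+ years after": (0.029, 0.047),
}


def rescale(coefficient, first_stage=SEPARATION_EFFECT):
    """Divide an intent-to-treat coefficient by the share of workers induced to separate."""
    return coefficient / first_stage


for label, (peak_definition, sudden_drop_definition) in ITT.items():
    print(f"{label}: {rescale(peak_definition):.3f} {rescale(sudden_drop_definition):.3f}")
```

The rescaled long-run effects can then be set against the main effect in row (1) of Table IV, which is the comparison the text draws.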
IV. POTENTIAL CHANNELS OF MASS-LAYOFF EFFECT ON MORTALITY

Our estimates of the short- and long-run effects of displacement on the mortality hazard in Figure II roughly parallel the short- and long-run effects of displacement on earnings shown in Figure I and reported in JLS and elsewhere. In the short run, displacement is associated not only with a sharp drop in mean earnings, but also with increased unemployment, job, region, and

33. If we estimate the same regression models with an individual dummy for the event of job displacement as dependent variable, the coefficient is 0.299 and 0.289 for the gradual and sudden drop definitions of mass layoff, respectively. This number also results from the fraction displaced shown in Appendix Table 1 of our longer working paper.

34. Note that the identification strategy differs for the two sets of estimates; in Table VI, the effect of the firm level event on separators and nonseparators gets added, whereas in the estimates in Table IV the estimates result from subtracting the effect on nonseparators from that of separators.

35. In an additional indication that selective displacement of less healthy workers mattered little in a similar context, Eliason and Storrie (2007) show that using detailed information on predisplacement occupation, health status, and demographics does not affect their estimates of the impact of job loss during establishment closures during 1986–1987 on health in Sweden (see the second and third sets of estimates in their Table VI).
industry mobility, as well as high earnings instability.36 Our results suggest these effects may lead to acute stress that substantially raises the mortality hazard. After this initially turbulent phase, in the medium to long run the majority of job losers settle into relatively stable employment at substantially lower mean earnings and modestly higher employment instability and earnings variability. Our empirical findings are consistent with a reduction in resources and increased instability leading to reduced investments in health and chronic stress that lead to a smaller, but longer-term, increase in the mortality hazard.37 To gain insight into the relative importance of some of the channels through which job loss could affect the long-run mortality hazard, we compare our estimates of the “reduced form” effect of displacement on mortality to what one would expect on the basis of displacement’s long-run effect on the mean and variability of workers’ earnings and the correlation of those factors with mortality shown in Table III. In our PA data, displacement reduces the mean of long-run earnings by 15%–20%. Taken at face value, the estimated −0.5 correlation of average earnings with mortality would imply that we expect an increase in mortality of about 7.5%–10% for workers with high job attachment (0.15 or 0.2 times 0.5). Thus about 50%–75% of the long-run effect of mass layoff on mortality reported in Table IV, about 0.1 to 0.15, could be explained by the observed declines in average earnings. Similarly, in the medium to long run, the standard deviation of log quarterly earnings increases on average by about 16% after a mass layoff (results not shown). At a coefficient of about 0.2 (Table III), this implies an increase in the probability of dying of about 3.2%. Although the order of magnitude of this effect is much lower than the potential impact of earnings, it could still account for about 20% of the mass-layoff effect, at least in the short run.
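The back-of-the-envelope decomposition in this paragraph multiplies the long-run change in each earnings measure by the corresponding coefficient from Table III. A minimal sketch follows, using the rounded values quoted in the text (an elasticity of −0.5 for average earnings and a coefficient of about 0.2 for the standard deviation of log earnings); the function name `implied_mortality_change` is an illustrative assumption, and the 0.15 long-run effect used for the final share is the upper end of the range cited from Table IV.

```python
EARNINGS_ELASTICITY = -0.5      # mortality w.r.t. average earnings (Table III, rounded)
VARIABILITY_COEFFICIENT = 0.2   # mortality w.r.t. std. dev. of log earnings (Table III, rounded)
LONG_RUN_EFFECT = 0.15          # upper end of the long-run effect cited from Table IV


def implied_mortality_change(proportional_change, coefficient):
    """Predicted proportional change in mortality from a proportional change in a covariate."""
    return proportional_change * coefficient


low = implied_mortality_change(-0.15, EARNINGS_ELASTICITY)            # 15% earnings loss
high = implied_mortality_change(-0.20, EARNINGS_ELASTICITY)           # 20% earnings loss
variability = implied_mortality_change(0.16, VARIABILITY_COEFFICIENT)  # 16% rise in std. dev.

print(f"earnings channel: {low:.3f} to {high:.3f}")
print(f"variability channel: {variability:.3f}")
print(f"variability share of long-run effect: {variability / LONG_RUN_EFFECT:.0%}")
```

As the text cautions later in this section, these products are predictions from correlations rather than causal parameters, so they are best read as rough upper bounds on each channel.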
In addition to these adverse effects, job loss may have beneficial effects on health. At least while not employed, displaced workers may be able to spend more time investing in their health

36. An extensive discussion of these results is contained in our longer working paper version (Table 9, Sullivan and von Wachter [2007]).

37. As noted previously, there are other potentially relevant channels to which our data do not speak directly, such as the loss of health insurance or the role of the family environment. Some of these channels may be associated with earnings losses (for example, if lower-paying jobs are less likely to provide health insurance) but may have independent effects. However, it is currently not possible to link information on health insurance, family status, or other worker or firm characteristics to our data.
and face reduced exposure to workplace and driving accidents.38 Although this channel may be present in our sample as well,39 on balance, the health of job losers we study did not benefit from short-term employment reduction. This may be because in the short run, when employment reduction is more frequent, job loss also involves very large earnings losses and a stressful adjustment period involving multiple job changes, including changes in industry or location. In the long run, the majority of workers we study did not benefit from reduced employment but instead suffered from continued employment at significantly lower earnings with a higher degree of uncertainty. Overall, these considerations suggest that the impact of displacement on the mean and variability of earnings may explain a large fraction of the increase in the long-run mortality hazard that we estimate. Of course, if frail people have large earnings losses, higher earnings instability, and higher death rates, the estimates underlying the above decomposition may not be causal parameters, and the predicted impact of each single mechanism is likely to be overstated.40 Nevertheless, among the channels we can measure, these calculations suggest that long-term earnings losses are likely to play a dominant role in explaining our results.

IV.A. Evidence from Individual-Level Models

To further explore the role of long-term earnings losses using the longitudinal data at our disposal, we also directly relate

38. Exploiting time-use data, Krueger and Mueller (2008) find some evidence that unemployed workers sleep more, spend more time purchasing goods and services (which includes obtaining medical services), and spend more time in leisurely activities (although the majority of the difference is explained by watching television). They do not find that the unemployed spend more time on personal care (which includes health-related self-care) or sports (their Table 3). They confirm findings from the previous literature that the unemployed are more unhappy or sad on the average (their Table 4).

39. In fact, we do find a beneficial effect of lower employment in 1974–1979 on mortality thereafter (Table III). However, spending time in nonemployment after displacement has no statistically significant effect on mortality (Table VII). Such potential beneficial effects of job loss on health are suggested by, among others, Ruhm (2000), who shows that mortality at the state level declines in recessions. We relate our results to Ruhm’s (2000) findings in the conclusion.

40. Reverse causation is less of a problem, because average earnings and earnings variability are measured in 1974–1979, whereas death is measured from 1987 to 2006 or from 1980 to 2006. In his study of Swedish lottery winners, Lindahl (2005, Appendix Table 2) shows that the effect of controlling for initial health conditions tends to reduce the correlation between mortality and earnings by about a third. Were this result to apply to our sample of high-attachment workers (who are likely of better health than the sample of older workers studied by Lindahl), the predicted role of earnings in explaining the mass-layoff effects is reduced by about a third.
the size of earnings losses at job displacement to the long-term increase in the hazard of death. This is shown in Figure III, Panel A. The figure shows the increase in mortality by deciles of long-term changes in mean earnings, controlling for age, year, and past average earnings.41 On average, those workers who have more substantial earnings losses also experience larger long-term increases in mortality hazard. We also directly included the long-term earnings change for both displaced workers and the comparison group as a control variable in our main logistic model. The results suggest a strong correlation between the change in average earnings and the long-term mortality hazard (model (1), Panel A, Table VII). Once we condition on earnings changes, the effect of the mass-layoff dummy becomes numerically small and statistically insignificant. To directly assess the potential role of the increase in the variability of earnings at job loss, we also estimated models that included the change in the standard deviation of quarterly earnings from 1987 to 1991 as an additional control variable (model (2), Panel A, Table VII). As expected, an increase in the standard deviation has a positive and significant effect on death rates, and the impact of earnings changes on long-term mortality declines somewhat. The results are robust if we divide the sample by degree of labor force attachment or displacement status. Estimates from the individual-level models appear to confirm the results of our approximations based on the estimates in Table III. However, we do not interpret these results as necessarily indicating causal channels running from earnings losses or earnings variability to mortality, because there may be omitted variables driving both earnings losses and mortality increases, such as differential increases in depression in response to layoffs. IV.B.
Evidence from Group-Level Models To mitigate the problem of omitted variable bias, we replicated our individual-level analysis at the group level. A long 41. Specifically, the figure shows coefficients on dummies for deciles of changes in the log of average annual earnings from 1974–1979 to 1980–1986 in a logistic model of death. The omitted category is that for earnings changes in the range [−0.05, 0.05]. Other variables include year effects, a fourth-order polynomial in age, and the average and standard deviation of earnings 1974–1979. To maximize sample sizes, the figure shows results based on the sample that includes older workers (see Table IV). The different lines correspond to different restrictions on presence in the Pennsylvania labor market during the period 1980–1986. All but the coefficients on the dummies [−0.2, −0.05], [0.05, 0.15], and [0.15, 0.3] are statistically significantly different from zero.
FIGURE III Mortality Rate by Size of Earnings Change from 1974–1979 to 1980–1986 at Individual Level and at Cell Level (Sample of Men in Stable Employment 1974–1979, Firm 1979 Employment ≥50) (A) Differences in mortality by deciles of the change in average earnings (relative to workers with no change in earnings), alternative degrees of presence in PA labor force in 1980–1986, workers born 1920–1959. Coefficients on dummies for deciles of changes in the log of average annual earnings from 1974–1979 to 1980–1986 in a logit model of death. The omitted category is earnings changes in the range [−0.05, 0.05]. Other variables include year effects, a quartic in age, and the average and standard deviation of earnings 1974–1979. (B) Effect of displacement on mortality and annual earnings by cells of industry and local unemployment rate in 1979 (28 cells), work every year in PA labor force 1980–1986, workers born 1930–1959. Cells corresponding to model (7) in Table VII. The slope of the regression line in the figure corresponds to the coefficient in the last column of the table. The effect on annual earnings refers to the effect of displacement on changes in the log of average annual earnings from 1974–1979 to 1980–1986 by cell. The effect on mortality rate refers to the effect of displacement on mortality by cell. Both models include cell-level dummies. The regression line is from a regression of mortality effects on earnings effects by cell level weighted by cell size. See the text for further information.
TABLE VII
EFFECT OF JOB DISPLACEMENT ON DEATH CONTROLLING FOR CAREER OUTCOMES AFTER DISPLACEMENT (WORKERS IN STABLE EMPLOYMENT 1974–1979, FIRM 1979 EMPLOYMENT ≥50, BORN 1930–1959)

Panel A: Log-odds model of death as function of displacement and changes in earnings

                                                      No work restriction           Work every year in PA
                                                      during 1980–1986              during 1980–1986
Coefficient in logit model                            All workers   Only displaced  All workers   Only displaced
Model (1)
  Dummy for job loss during mass layoff               −0.022        —               −0.011        —
                                                      (0.040)       —               (0.048)       —
  Percent change in long-term average earnings        −0.861        −0.672          −0.669        −0.502
                                                      (0.060)       (0.096)         (0.102)       (0.160)
Model (2)
  Dummy for job loss during mass layoff               −0.041        —               −0.060        —
                                                      (0.048)       —               (0.051)       —
  Percent change in long-term average earnings        −0.391        −0.432          −0.524        −0.433
                                                      (0.089)       (0.150)         (0.110)       (0.186)
  Change in log standard deviation of log earnings    0.201         0.184           0.203         0.155
                                                      (0.026)       (0.046)         (0.028)       (0.054)
  At least one transition to nonemployment            0.012         −0.076          0.025         −0.054
                                                      (0.047)       (0.088)         (0.048)       (0.090)
TABLE VII (CONTINUED)
Panel B: Linear probability models of death on individual-level change in average earnings and change in average earnings predicted by interaction of displacement and cell-level dummies
Column groups: No work restriction during 1980–1986; Work every year in PA during 1980–1986
Columns within each group: Individual earnings change; Predicted earnings change
Cells used to predict change in average quarterly earnings:
(3) Four age groups in 1979 times four groups of avg. earnings in 1974–1979
(4) Four age groups in 1979 times seven industry groups in 1979
(5) Ten groups of average 1974–1979 earnings of 1979 employer
(6) Ten groups of average 1974–1979 employment of 1979 employer
(7) Seven industry groups in 1979 times four groups of county unempl. in 1979
−0.0027 (0.0008) −0.0028 (0.0008) −0.0024 (0.0009) −0.0026 (0.0003) −0.0025 (0.0005)
−0.0046 (0.0003) −0.0046 (0.0003) −0.0047 (0.0003) −0.0046 (0.0003) −0.0046 (0.0003)
−0.0042 (0.0006) −0.0043 (0.0006) −0.0043 (0.0006) −0.0042 (0.0006) −0.0042 (0.0006)
Individual earnings change
−0.0019 (0.0014) −0.0022 (0.0014) −0.0022 (0.0005) −0.0018 (0.0004) −0.0016 (0.0009)
Predicted earnings change
Notes. The models in Panel A show coefficients for logit models of the annual hazard of death controlling for year and industry dummies, a quartic in age, pre–mass layoff career outcomes (log average quarterly earnings, log of standard deviation of log quarterly earnings, number of quarters in nonemployment, average quarterly growth in earnings, all measured from 1974 to 1979), as well as changes in career outcomes from 1974–1979 to 1987–1991 (percent change in the standard deviation of log quarterly earnings, change in the average growth rate of quarterly earnings). The models in Panel B report coefficients on linear probability models of the hazard of death on either the actual individual change in average quarterly earnings or the change in earnings predicted by interactions of cell-level dummies with mass-layoff dummy; all models control for cell-level dummies, year dummies, industry dummies, a quartic in age, and pre–mass layoff career outcomes. The coefficient on the individual-level earnings change varies across specifications because different cell-level dummies are included. Standard errors are in parentheses.
literature suggests that there are systematic differences in the effects of job loss on earnings,42 some of which arise from industry-, firm-, or job-match-specific rents workers receive on the job that are permanently lost as workers change employers (e.g., von Wachter and Bender [2006]).43 Because at least part of the loss in earnings for workers leaving high-paying industries or large employers is likely to be uncorrelated to health changes at the cell level, we expect the omitted-variable bias of the cell-level estimates to be lower than that of individual-level estimates. To implement the group-level analysis, we use the detailed information available from our administrative data to divide workers into cells based on age at layoff, industry and local labor market conditions before layoff, and average employment size or average wage of the 1979 employer. We then run a regression of increases in the mortality hazard on losses in average earnings at the cell level. The model controls for permanent differences in average health and earnings across groups through cell-level dummies. Similarly, it accounts for a common effect of displacement on all groups through a mass-layoff dummy.44 The results suggest that even when we use cell-level variation, the effect of earnings losses on mortality is economically and statistically significant. Panel B of Table VII shows the coefficient estimates on actual earnings changes in a linear probability model as a benchmark, as well as the slope coefficient in the cell-level model. The effect of earnings losses on mortality at the cell level is 40%–50% of the effect at the individual level for the samples with no or a low work requirement. The cell-level effect is somewhat smaller but still statistically and economically significant for the high-attachment sample, yet standard errors increase as 42. See, for example, Kletzer (1989), JLS (1993), Neal (1995), and Farber (2003). 43. 
Such rents could arise due to contractual premiums (Beaudry and DiNardo 1991), job search (Topel and Ward 1992), and firm, industry, and regional wage premiums (e.g., Krueger and Summers [1988]; Abowd, Creecy, and Kramarz [2002]). 44. The cell-level model is implemented in two steps. First, we regressed individual earnings and mortality on characteristics such as baseline earnings, baseline standard deviation, a quartic in age, and year effects; in addition, we included cell-level dummies and interactions between cell-level dummies and the mass-layoff dummy. The coefficients on the interaction are used in the second step. Second, we regressed cell-level changes in mortality on cell-level earnings losses, weighting by the inverse sampling variance of the cell-specific mortality increases. The resulting slope coefficient is equal to the two-stage least-squares estimate from using the interactions of group dummies and mass-layoff dummy as an instrument for earnings. See Section 3 of our longer working paper for the relevant equations and further discussion (Sullivan and von Wachter 2007).
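The second step described in footnote 44 can be sketched as a weighted least-squares line through the cell-level effects. This is only a minimal illustration under stated assumptions: the function name `weighted_slope`, the inclusion of an intercept (standing in for the common mass-layoff effect), and all numbers below are hypothetical and are not taken from the paper's data or code.

```python
def weighted_slope(earnings_effects, mortality_effects, sampling_variances):
    """Weighted least-squares fit of cell-level mortality effects on
    cell-level earnings effects, weighting each cell by the inverse
    sampling variance of its mortality effect.

    Assumption: an intercept is included to absorb a common effect
    of displacement across cells.
    """
    weights = [1.0 / v for v in sampling_variances]
    total = sum(weights)
    # Weighted means of the regressor and the outcome.
    x_bar = sum(w * x for w, x in zip(weights, earnings_effects)) / total
    y_bar = sum(w * y for w, y in zip(weights, mortality_effects)) / total
    # Weighted covariance and variance around those means.
    cov = sum(
        w * (x - x_bar) * (y - y_bar)
        for w, x, y in zip(weights, earnings_effects, mortality_effects)
    )
    var = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, earnings_effects))
    slope = cov / var
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical cell-level estimates, invented purely for illustration.
earnings_effects = [-0.30, -0.20, -0.10, -0.05]     # change in log earnings by cell
mortality_effects = [0.0015, 0.0011, 0.0006, 0.0004]  # change in death hazard by cell
sampling_variances = [4e-8, 1e-8, 2e-8, 9e-8]       # variance of each mortality effect

slope, intercept = weighted_slope(
    earnings_effects, mortality_effects, sampling_variances
)
print(f"slope = {slope:.5f}, intercept = {intercept:.5f}")
```

A negative slope in such a fit would correspond to larger earnings losses going together with larger mortality increases across cells; cells whose mortality effects are estimated more precisely pull the line harder.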
well. Our findings are quite similar for the different definitions of cells shown in the table (models (3)–(7)). Panel B in Figure III displays the corresponding cell-level averages and a linear regression line for the high-attachment sample for the results of model (7) in Table VII. The figure displays a clear negative and relatively precise association between earnings losses and mortality increases at the cell level, although there is an important degree of variation left. Overall, the results confirm that earnings losses after job displacement appear to be strongly correlated with increases in mortality rates.45 Because we cannot exclude that health and earnings responses to layoff may be correlated across cells as well, we do not interpret the resulting cell-level estimates as causal effects of earnings changes on the long-term mortality hazard. IV.C. Earnings Losses and Life Expectancy The analysis thus far has concentrated on the impact of long-term declines in quarterly earnings, irrespective of their duration. However, standard models of health investment refer to lifetime resources as the relevant earnings concept. Thus, we calculated the present discounted value (PDV) of lifetime earnings losses following a job loss in a mass layoff by age group (Sullivan and von Wachter 2007).46 We then compared losses in life expectancy taken from Table V with the corresponding PDV of earnings losses by age group. The correlation is strong, monotonic, and numerically large; as the percentage loss in the PDV of earnings doubles for thirty- to forty-year-olds relative to fifty- to sixty-year-olds, life expectancy declines by about 25%. Young workers have not only 45. The mean squared error of the group-level regressions is a test statistic for overidentification with a limiting chi-squared distribution with degrees of freedom equal to the number of cells minus the number of parameters (e.g., Angrist [1991]).
In results not reported here, for none of our group-level models can we reject the overidentifying restrictions at any reasonable level of statistical significance. This supports an underlying model of a constant proportional effect of earnings on mortality and gives no indication of endogeneity of cell-level earnings changes with respect to mortality that differs across cells. The fact that our cell-level estimates tend to be smaller than individual-level estimates suggests that the latter indeed are affected by omitted-variable bias common across groups. 46. To do so, we allowed both the short- and long-term effects of job loss on earnings to vary by either ten-year or five-year age groups. Because we do not have complete earnings histories after job loss for all workers, we assume that the earnings loss decays at the same speed of reversion observed between years six and eleven after a job loss, and eventually stays fixed at zero. The resulting values are robust to alternative specifications. The percentage decreases in the present discounted value of earnings of our preferred specification for the age groups in Table V are, going from youngest to oldest, 12.3%, 10.8%, 9.2%, 7.4%, 5.6%, and 3.8%, respectively.
higher average mortality increases after a job loss (Table IV), but also the largest losses in remaining lifetime (Table V) and in lifetime earnings. Thus, it is not the oldest workers who are most affected, but those at prime working age who are exposed to the negative consequences of job loss over a longer period of time. Overall, we interpret the results in Section IV as suggesting that at least for job losses involving earnings declines as dramatic as those in PA in the early to mid-1980s, the sources of the increase in mortality are likely to be associated with long-term losses in average earnings and increases in the variability of earnings. This may include direct effects of reduced earnings and increased variability, but clearly also includes stress, adjustment costs, and other factors correlated with both long-term earnings declines and mortality. V. CONCLUSION This paper uses administrative data covering over fifteen years of quarterly earnings and employer records matched to information on date of death to study the effects of job displacement on mortality of high-seniority male workers losing their jobs in PA in the early to mid-1980s. To measure an event plausibly exogenous to workers’ own health outcomes, we analyze job losses occurring when employers experience mass layoffs affecting at least 30% of their work forces. To further control for selection, we also control for workers’ average earnings and a range of career outcomes in the period before job loss and present selection-free estimates pooling movers and stayers. The results suggest a particularly pronounced increase in mortality during the period immediately following job loss and a long-run increase of 10%–15% in the annual probability of dying persisting for at least the next twenty years. These effects, robust across alternative samples and specifications, are consistent with strong responses to both acute and chronic stress associated with worsened labor market opportunities.
To assess the channels underlying the mass-layoff effect, we analyze the correlation of long-run career outcomes with mortality. We show that the mean and standard deviation of earnings during a baseline period have high and significant correlations with mortality in a later follow-up period. Together with estimates of the effects of mass layoffs on long-run career outcomes, these results suggest that an important fraction of the effect of job
loss on mortality can be attributed to persistent losses in earnings. This is confirmed by a direct analysis of differences in mortality responses by groups of workers with differential earnings losses at job displacement associated with industry or employer affiliation before displacement. These results suggest that events in the labor market shaping workers’ careers also have long-run effects on health outcomes. The losses in life expectancy implied by our estimates show that these effects can be large. A worker displaced in mid-career can expect to live about one and one-half years less than a nondisplaced counterpart. The reduction in life expectancy is smaller for older workers, who experience lower lifetime earnings losses and are exposed to increased mortality for a shorter period of time. Our results do not speak to the role of noneconomic factors such as stress, self-worth, and happiness. Yet they suggest that an important avenue for future research would be to examine whether the negative health consequences of mass layoffs can be prevented by providing assistance that stabilizes the level and variance of earnings. Similarly, although the experience of displaced workers has been found to be similar in other states and time periods, it is important to replicate our study of male workers displaced in PA in the 1980s for other regions and time periods, and for women. Finally, our results are not in conflict with recent work suggesting that mortality declines during recessions, possibly because of healthier lifestyles and a reduction in accidents related to work or commuting (Ruhm 2000). 
First, although recessions do increase the number of high-tenure displaced workers, whose mortality we find to be elevated, such workers are a small fraction of those affected by economic downturns.47 Second, Ruhm (2000) focuses on fluctuations in mortality that are contemporaneous with cyclical fluctuations in economic activity, whereas the bulk of the effects we observe take place many years after displacement. Finally, from the perspective of the aggregate economy, a recession is a relatively minor event that only marginally reduces the present value of lifetime income for the representative worker-consumer and at the same time provides a modest increase in leisure. For an individual high-tenure worker, however, job loss is a major economic setback that significantly reduces lifetime 47. See, for example, Aaronson and Sullivan (1998). The gains in health during recessions measured by Ruhm (2000) may be due to changes in hours worked by employed workers or to changes in employment rates of those with less strong job attachment.
income, without a corresponding reduction in work activity. Thus, the workers we study, although having fewer lifetime resources, did not enjoy the increases in leisure, healthier lifestyles, or reductions in accidents that may explain Ruhm’s results.

APPENDIX
MORTALITY RATES IN DIFFERENT PERIODS BY DISPLACEMENT STATUS AND BY SIZE OF EMPLOYMENT DROP OF 1979 EMPLOYER

                                            Non–mass         Displaced         Displaced         Displaced more
Range of        All       Same firm         layoff           30%–60%           60%–90%           than 90%
years           workers   1974 to 1986      separators       below peak        below peak        below peak

Panel A: No work restriction in 1980–1986
1987–2006       6.764     6.013             7.253            7.191             7.764             8.449
                (0.143)   (0.191)           (0.362)          (0.384)           (0.435)           (0.640)
1987–1991       4.167     3.446             4.334            5.012             5.132             5.518
                (0.181)   (0.234)           (0.451)          (0.516)           (0.569)           (0.830)
1992–1996       7.407     6.790             7.837            7.520             8.033             9.677
                (0.227)   (0.308)           (0.571)          (0.596)           (0.671)           (1.039)
1997–2006       10.815    9.639             12.163           11.129            12.847            11.892
                (0.427)   (0.570)           (1.108)          (1.130)           (1.324)           (1.803)

Panel B: Work every year 1980–1986
1987–2006       6.343     6.013             6.682            6.219             7.319             7.645
                (0.152)   (0.191)           (0.434)          (0.444)           (0.509)           (0.733)
1987–1991       3.745     3.446             3.762            4.134             4.798             4.202
                (0.189)   (0.234)           (0.526)          (0.583)           (0.664)           (0.875)
1992–1996       6.994     6.790             6.990            6.680             7.639             8.790
                (0.242)   (0.308)           (0.673)          (0.698)           (0.789)           (1.191)
1997–2006       10.347    9.639             12.211           9.581             12.010            12.351
                (0.458)   (0.570)           (1.383)          (1.298)           (1.541)           (2.205)

Notes. Deaths per thousand per year. Standard errors in parentheses. Panel A: Displaced workers left jobs in a year in which their former firms’ employment was 30% or more below its 1974–1979 peak. Panel B: Displaced workers left jobs in a year in which their former firms’ employment was 30% or more below its 1974–1979 peak; nondisplaced workers remained at their 1979 firms through 1986.
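As a reading aid for the appendix table, the raw rates can be turned into ratios relative to workers who stayed with the same firm. The sketch below uses the 1987–1991 row of Panel A; the function name `relative_mortality` is hypothetical, and the resulting ratios are unadjusted comparisons of raw rates, not the regression-based estimates reported in the paper.

```python
# Deaths per thousand per year, 1987-1991, Panel A (no work restriction),
# copied from the appendix table.
rates_1987_1991 = {
    "Same firm 1974 to 1986": 3.446,
    "Non-mass layoff separators": 4.334,
    "Displaced 30%-60% below peak": 5.012,
    "Displaced 60%-90% below peak": 5.132,
    "Displaced more than 90% below peak": 5.518,
}


def relative_mortality(rates, baseline):
    """Return each group's raw death rate divided by the baseline group's rate."""
    base = rates[baseline]
    return {group: rate / base for group, rate in rates.items()}


ratios = relative_mortality(rates_1987_1991, "Same firm 1974 to 1986")
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
```

Because these ratios ignore age, prior earnings, and the other controls used in the main models, they are best read as a descriptive summary of the table rather than as effect sizes.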
RESEARCH DEPARTMENT, FEDERAL RESERVE BANK OF CHICAGO
DEPARTMENT OF ECONOMICS, COLUMBIA UNIVERSITY, NBER, CEPR, AND IZA
REFERENCES
Aaronson, Daniel, and Daniel Sullivan, “The Decline of Job Security in the 1990s: Displacement, Anxiety, and Their Effect on Wages,” Federal Reserve Bank of Chicago, Economic Perspectives, 22 (1998), 17–43.
Abowd, John, Robert Creecy, and Francis Kramarz, “Computing Person and Firm Effects Using Linked Longitudinal Employer–Employee Data,” mimeo, Cornell University, 2002.
Abraham, Katharine G., and James L. Medoff, “Length of Service and Layoffs in Union and Nonunion Work Groups,” Industrial Labor Relations Review, 38 (1984), 87–97.
Angrist, Joshua, “Grouped Data Estimation and Testing in Simple Labor Supply Models,” Journal of Econometrics, 47 (1991), 243–266.
Beaudry, Paul, and John DiNardo, “The Effect of Implicit Contracts on the Movement of Wages over the Business Cycle: Evidence from Micro Data,” Journal of Political Economy, 99 (1991), 665–688.
Black, Dan, Kermit Daniel, and Seth Sanders, “The Impact of Economic Conditions on Participation in Disability Programs: Evidence from the Coal Boom and Bust,” American Economic Review, 92 (2002), 27–50.
Browning, Martin, Anne Moller Dano, and Eskil Heinesen, “Job Displacement and Stress-Related Health Outcomes,” Health Economics, 15 (2006), 1061–1075.
Burgard, Sarah, Jennie Brand, and James House, “Causation and Selection in the Relationship of Job Loss to Health in the United States,” mimeo, University of Michigan, 2005.
——, “Toward a Better Estimation of the Effect of Job Loss on Health,” Journal of Health and Social Behavior, 48 (2007), 369–384.
Card, David, Raj Chetty, and Andrea Weber, “Cash-on-Hand and Competing Models of Intertemporal Behavior: New Evidence from the Labor Market,” Quarterly Journal of Economics, 122 (2007), 1511–1560.
Chan, Sewin, and Ann Huff Stevens, “Job Loss and Employment Patterns of Older Workers,” Journal of Labor Economics, 19 (2001), 484–521.
Chen, Susan, and Wilbert van der Klaauw, “The Work Disincentive Effects of the Disability Insurance Program in the 1990s,” Journal of Econometrics, 142 (2008), 757–784.
Couch, Kenneth, and Dana Placzek, “The Earnings Impact of Job Displacement Measured with Longitudinally Matched Individual and Firm Data,” American Economic Review (forthcoming).
Cutler, David, Angus Deaton, and Adriana Lleras-Muney, “The Determinants of Mortality,” Journal of Economic Perspectives, 20 (2006), 97–120.
Darity, William, and Arthur Goldsmith, “Social Psychology, Unemployment, and Macroeconomics,” Journal of Economic Perspectives, 10 (1996), 121–140.
Deaton, Angus, “Inequalities in Income and Inequalities in Health,” in The Causes and Consequences of Increasing Inequality, Finis Welch, ed. (Chicago: University of Chicago Press, 2001).
Deaton, Angus, and Christina Paxson, “Mortality, Education, Income and Inequality among American Cohorts,” NBER Working Paper No. 7140, 1999.
Efron, Bradley, “Logistic Regression, Survival Analysis, and the Kaplan–Meier Curve,” Journal of the American Statistical Association, 83 (1988), 414–425.
Eliason, Marcus, and Donald Storrie, “Latent or Lasting Scars: Swedish Evidence on the Long-Term Effects of Job Displacement,” Journal of Labor Economics, 24 (2006), 831–856.
——, “Does Job Loss Shorten Life?” Göteborg University Department of Economics Working Paper Series No. 153, 2007.
Farber, Henry, “Job Loss in the United States, 1981–2001,” Princeton University Industrial Relations Section Working Paper No. 471, 2003.
Gallo, W. T., H. M. Teng, T. A. Falba, S. V. Kasl, H. M. Krumholz, and E. H. Bradley, “The Impact of a Late-Career Job Loss on Myocardial Infarction and Stroke: A 10-Year Follow-Up Using the Health and Retirement Survey,” Occupational and Environmental Medicine, 63 (2006), 683–687.
Gruber, Jonathan, “The Consumption Smoothing Benefit of Unemployment Insurance,” American Economic Review, 87 (1997), 192–205.
Hildreth, Andrew, Till von Wachter, and Elizabeth Weber Handwerker, “Estimating the ‘True’ Cost of Job Loss: Evidence Using Matched Data from California 1991–2000,” mimeo, Columbia University, 2008.
Hill, Mark E., and Ira Rosenwaike, “The Social Security Administration’s Death Master File: The Completeness of Death Reporting at Older Ages,” Social Security Bulletin, 64 (2002), 45–51.
Jacobson, Louis, Robert LaLonde, and Daniel Sullivan, “Earnings Losses of Displaced Workers,” American Economic Review, 83 (1993), 685–709.
Kletzer, Lori, “Returns to Seniority after a Permanent Job Loss,” American Economic Review, 79 (1989), 536–43.
Kodrzycki, Yolanda K., “Using Unexpected Recalls to Examine the Long-Term Earnings Effects of Job Displacement,” Federal Reserve Bank of Boston Working Paper 07-2, 2007.
Krueger, Alan, and Andreas Mueller, “The Lot of the Unemployed: A Time Use Perspective,” IZA Discussion Paper No. 3490, 2008.
Krueger, Alan, and Lawrence Summers, “Efficiency Wages and the Inter-industry Wage Structure,” Econometrica, 56 (1988), 259–293.
Kuhn, Andreas, Rafael Lalive, and Josef Zweimüller, “The Public Health Costs of Unemployment,” Cahiers de Recherches Economiques du Département d’Econométrie et d’Economie politique (DEEP) 07.08, Université de Lausanne, Faculté des HEC, DEEP, 2007.
Lindahl, Mikael, “Estimating the Effect of Income on Health and Mortality Using Lottery Prizes as Exogenous Source of Variation in Income,” Journal of Human Resources, 40 (2005), 144–168.
Martikainen, Pekka, Netta Mäki, and Markus Jäntti, “The Effects of Unemployment on Mortality Following Workplace Downsizing and Workplace Closure: A Register-Based Follow-Up Study of Finnish Men and Women during Economic Boom and Recession,” American Journal of Epidemiology, 165 (2007), 1070–1075.
Martikainen, Pekka, and Tapani Valkonen, “Mortality after Death of a Spouse in Relation to Duration of Bereavement in Finland,” Journal of Epidemiology and Community Health, 50 (1996), 264–268.
National Center for Health Statistics, “United States Life Tables, 2003,” National Vital Statistics Reports, 54 (2006), 1–40.
Neal, Derek, “Industry-Specific Human Capital: Evidence from Displaced Workers,” Journal of Labor Economics, 13 (1995), 653–677.
Olson, Craig, “The Impact of Permanent Job Loss on Health Insurance Benefits,” Princeton University Industrial Relations Section Working Paper No. 684, 1992.
Rege, Mari, Kjetil Telle, and Mark Votruba, “The Effect of Plant Downsizing on Disability Pension Utilization,” Journal of European Economic Association, forthcoming.
Ruhm, Christopher, “Are Workers Permanently Scarred by Job Displacements?” American Economic Review, 81 (1991), 319–24.
——, “Are Recessions Good for Your Health?” Quarterly Journal of Economics, 115 (2000), 617–650.
Schoeni, Robert, and Michael Dardia, “Estimates of Earnings Losses of Displaced Workers Using California Administrative Data,” PSC Research Report No. 03–543, 2003.
Smith, James, “Healthy Bodies and Thick Wallets: The Dual Relation between Health and Economic Status,” Journal of Economic Perspectives, 13 (1999), 145–166.
Stevens, Ann Huff, “Persistent Effects of Job Displacement: The Importance of Multiple Job Losses,” Journal of Labor Economics, 15 (1997), 165–188.
Sullivan, Daniel, and Till von Wachter, “Mortality, Mass-Layoffs, and Career Outcomes: An Analysis Using Administrative Data,” NBER Working Paper No. 13626, 2007.
——, “Average Earnings and Long-Term Mortality: Evidence from Administrative Data,” American Economic Review: Papers and Proceedings, 99 (2009), 133–138.
Topel, Robert, and Michael Ward, “Job Mobility and the Careers of Young Men,” Quarterly Journal of Economics, 107 (1992), 439–479.
Von Wachter, Till, and Stefan Bender, “At the Right Place at the Wrong Time: The Role of Firms and Luck in Young Workers’ Careers,” American Economic Review, 96 (2006), 1679–1705.
Von Wachter, Till, Jae Song, and Joyce Manchester, “Long-Term Earnings Losses Due to Mass Layoffs during the 1982 Recession: An Analysis Using Longitudinal Administrative Data from 1974 to 2004,” mimeo, Columbia University, 2009.
TRUST AND SOCIAL COLLATERAL∗
DEAN KARLAN MARKUS MOBIUS TANYA ROSENBLAT ADAM SZEIDL
This paper builds a theory of trust based on informal contract enforcement in social networks. In our model, network connections between individuals can be used as social collateral to secure informal borrowing. We define network-based trust as the largest amount one agent can borrow from another agent and derive a reduced-form expression for this quantity, which we then use in three applications. (1) We predict that dense networks generate bonding social capital that allows transacting valuable assets, whereas loose networks create bridging social capital that improves access to cheap favors such as information. (2) For job recommendation networks, we show that strong ties between employers and trusted recommenders reduce asymmetric information about the quality of job candidates. (3) Using data from Peru, we show empirically that network-based trust predicts informal borrowing, and we structurally estimate and test our model.
I. INTRODUCTION A growing body of research demonstrates the importance of trust for economic outcomes.1 Arrow (1974) calls trust “an important lubricant of a social system.” If trust is low, poverty can persist because individuals are unable to acquire capital, even if they have strong investment opportunities. If trust is high, informal transactions can be woven into daily life and help generate efficient allocations of resources. But what determines the level of trust between individuals? In this paper we propose a model where the social network influences how much agents trust each other. Sociologists such as Granovetter (1985), Coleman (1988), and Putnam (2000) have long argued that social networks play an important role in ∗ We thank Attila Ambrus, Susan Athey, Antoni Calvó-Armengol, Pablo Casas-Arce, Rachel Croson, Avinash Dixit, Drew Fudenberg, Andrea Galeotti, Ed Glaeser, Sanjeev Goyal, Daniel Hojman, Matthew Jackson, Rachel Kranton, Ariel Pakes, Andrea Prat, Michael Schwarz, Andrei Shleifer, Andy Skrzypacz, Fernando Vega-Redondo, and seminar participants for helpful comments.
1. Trust has been linked with outcomes including economic growth (Knack and Keefer 1997), judicial efficiency and lack of corruption (La Porta et al. 1997), international trade and financial flows (Guiso, Sapienza, and Zingales 2009), and private investment (Bohnet, Herrman, and Zeckhauser 2008).
© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009
building trust.2 In our model, networks create trust when agents use connections as social collateral to facilitate informal borrowing. The possibility of losing valuable friendships secures informal transactions in the same way that the possibility of losing physical collateral can secure formal lending.3 Because both direct and indirect connections can serve as social collateral, the level of trust is determined by the structure of the entire network. Although we present our model in terms of trust in a borrowing transaction, it can also apply to other situations that involve moral hazard or asymmetric information, such as hiring workers through referrals.4 To understand the basic logic of our model, consider the examples in Figure I, where agent s would like to borrow an asset, such as a car, from agent t, in an economy with no formal contract enforcement. In Figure IA, the network consists only of s and t; the value of their relationship, which represents either the social benefits of friendship or the present value of future transactions, is assumed to be 2. As in standard models of informal contracting, t will only lend the car if its value does not exceed the relationship value of 2. More interesting is Figure IB, where s and t have a common friend u, the value of the friendship between s and u is 3, and that between u and t is 4. Here, the common friend increases the borrowing limit by min[3, 4] = 3, the weakest link on the path connecting borrower and lender through u, to a total of 5. The logic is that the intermediate agent u vouches for the borrower, acting as a guarantor of the loan transaction. If the borrower chooses not to return the car, he is breaking his promise of repayment to u, and therefore loses u’s friendship. Because the value of this friendship is 3, it can be used as collateral for a payment of up to 3. For the lender t to receive this amount, u must prefer transmitting the 2. Glaeser et al. 
(2000) show in experiments that social connections increase trust. Field evidence on the role of networks in trust-intensive exchange includes McMillan and Woodruff (1999) and Johnson, McMillan, and Woodruff (2002) for business transactions in Vietnam and transition countries; Townsend (1994) and Udry (1994) for insurance arrangements in India and Nigeria; and Macaulay (1963) and Uzzi (1999) for firms in the United States.
3. We abstract from morality, altruism, and other mechanisms that can generate trust even between strangers (e.g., Berg, Dickhaut, and McCabe [1995], Fukuyama [1995]); hence our definition of trust is like Hardin’s (1992).
4. In related work, Kandori (1992), Greif (1993), and Ellison (1994) develop models of community enforcement where deviators are punished by all members of society. More recently, Dixit (2003), Lippert and Spagnolo (2006), Ali and Miller (2008), and Bloch, Genicot, and Ray (2008) have explored models of informal contracting where networks are used to transmit information. In contrast, in our work the network serves as social collateral.
FIGURE I
Social Collateral in Simple Networks
Panels: A. Two-agent network; B. Common friend; C. No direct link.
This figure illustrates the calculation of trust in simple networks. In all three panels, agent s wishes to borrow an asset from agent t. In Panel A, both agents are direct friends and the borrowing limit is equal to 2, the strength of their relationship. In Panel B, their relationship is strengthened by a common friend u and the borrowing limit increases by min[3, 4] = 3, which is the value of the weakest link on the path connecting s and t through u. In Panel C, borrower and lender are not direct friends and the borrowing limit is the sum of the weakest links on the two paths between s and t. See the text for details.
payment to losing the friendship with him, explaining the role of the weakest link. Our main theoretical result is that in general networks, the level of trust equals the sum of the weakest link values over all disjoint paths connecting borrower and lender. This quantity is called the maximum network flow, a well-studied concept in graph theory.5 Intuitively, the maximum flow is the largest amount that can flow from borrower to lender along the edges of the network, respecting the capacity constraints given by link values. This concept does not require the borrower and the lender to be directly linked; for example, in Figure IC, where s and t are not connected but share two common friends, the borrowing limit is the sum of the weakest links on the two paths connecting s and t, min[3, 4] + min[2, 1] = 4, because each intermediate agent can vouch for part of the value of the car. The key idea in the proof of our main result is to characterize coalition-proof informal 5. See Cormen et al. (2001) for a textbook treatment.
contracts using the maximum flow–minimum cut theorem (Ford and Fulkerson 1956), a famous result in computer science.
The paper also develops three applications of this social collateral model. The first application, which explores the effect of network structure on welfare, helps reconcile two seemingly competing views of sociologists. Coleman (1988) emphasizes the benefits of networks with high closure, where connected agents share many common friends, which facilitate the enforcement of cooperation. In contrast, Burt (1995) argues that loose networks, that is, low closure, are better, because they provide greater access to information and other resources. The social collateral model can reconcile these views by identifying a trade-off between trust and access, which implies that the relative benefit of high or low closure depends on the value of the assets being transacted. Closure is more attractive when agents tend to exchange valuable assets, because it maximizes trust among a small number of individuals. This is in line both with Coleman’s general argument and with his example of diamond dealers in New York, who exchange valuable stones in a tight network of family and religious ties. In contrast, when the network is mainly used to exchange small favors such as giving information or advice, large and loose neighborhoods are better because they maximize access to these resources. These results also provide foundations and network-based measures for Putnam’s (2000) concepts of bonding versus bridging social capital and have implications for the design of organizations.
In a second application, we study the implications of network-based trust for job recommendations. It is well known that many jobs are found through social networks (Ioannides and Loury 2004). A common explanation is that information about job openings spreads through friends and acquaintances.
This “strength of weak ties” argument, made by Granovetter (1973), predicts that weak links to agents with whom one has few common friends are most useful for job search, because they provide access to otherwise unobtainable information. However, the evidence is mixed: many studies find that strong ties in dense networks are more important. Our model suggests a reason for the strength of strong ties in job search: trusted recommenders can reduce asymmetric information about job candidates. In the social collateral model, networks do help identify high-type workers, but only if the trust flow between the recommenders and the employer exceeds the sensitivity of profits to worker type. Recommendations from less
trusted individuals are not credible, because a low-type candidate can bribe the recommender to put in a good word for him. This result implies that the relative importance of weak versus strong connections should vary as a function of the skill sensitivity of the job, which can help explain the mixed evidence about weak ties. We also obtain new predictions: agents hired through the network should earn higher wages; this wage gap should be increasing in the skill intensity of the job; and employers should rely more on social networks to fill skill-intensive vacancies. Although these predictions do not emerge in a model of information transmission about vacancies, they are consistent with existing evidence, suggesting that trusted referrals can be important for understanding job search. In the third, empirical application, we estimate and test our model using a unique data set on social networks and informal lending in two low-income shantytowns in Peru. In these communities, informal borrowing is very common, making the data an ideal fit for our theory. For example, 46% of households have recently borrowed money from others in their immediate social networks. We estimate the social collateral model in this data using a discrete choice framework, which allows us to back out the relative strength of network links as a function of time spent together, and establish three results. (1) Confirming our main prediction, we document a strong positive correlation between social collateral and borrowing, which is primarily driven by strong ties. For example, increasing trust flow by a link in the top one-third of the distribution of time spent together increases the probability of borrowing by a factor of 2.7. (2) We show that direct and indirect paths have similar effects on borrowing, demonstrating the importance of network closure for building trust. (3) We verify the key structural implication that borrowing should be determined by the weakest link on a path. 
Our results are inconsistent with alternative explanations such as altruism or information transmission, which do not predict that indirect paths should matter through the value of the weakest link. Taken together, we find strong support for the social collateral model; our results also suggest that strong ties and high closure, i.e., bonding social capital, are particularly important for borrowing. The rest of the paper is organized as follows. Section II collects motivating evidence about the social collateral mechanism. Section III develops the model and derives the reduced form expression for trust. Section IV presents our theoretical applications
and Section V our empirical application. Section VI concludes by sketching some other applications. All proofs are in the Appendix. II. SOCIAL COLLATERAL: SUGGESTIVE EVIDENCE This section presents evidence about social networks and informal contract enforcement. It is a well-documented fact that social networks are often used for trust-intensive exchanges.6 In this section, we focus on documenting anecdotal evidence about the mechanism through which networks create the trust necessary for these transactions. We begin with an example originally attributed to Wechsberg (1966), which we take from Coleman (1990). This example is of a prominent Norwegian shipowner who was in need of a ship that had undergone repairs in an Amsterdam shipyard. However, “the yard would not release the ship unless a cash payment was made of 200,000 pounds. Otherwise the ship would be tied up for the weekend, and the owner would lose at least twenty thousand pounds.” The shipowner was in trouble, because he could not have 200,000 pounds delivered immediately to Amsterdam. To solve this problem, he called a London banker at Hambros, who presumably had contacts in Amsterdam. After hearing the situation, “the Hambros man looked at the clock and said, ‘It’s getting late but I’ll see whether I can catch anyone at the bank in Amsterdam . . . stay at the phone.’ Over a second phone he dictated to a secretary in the bank a telex message to the Amsterdam bank: ‘Please pay 200,000 pounds telephonically to (name) shipyard on understanding that (name of ship) will be released at once.’ ” In this example, the shipowner borrowed 200,000 pounds on immediate notice from an Amsterdam bank with which he had no direct connection. He accomplished this by collateralizing two business relations: his connection with the London banker, and the connection between the London and Amsterdam banks. 
In Coleman’s (1990) terminology, the London banker acted as a “trust intermediary”: by vouching for the shipowner, he provided access and created the necessary trust for the transaction. If the shipowner were to default, the Amsterdam bank could ask the London banker to pay compensation or risk jeopardizing their relationship; and similarly, the London banker could presumably 6. For references, see the citations in footnote 2, as well as Table 1 in our working paper, Mobius and Szeidl (2007).
extract money from the shipowner if needed. This is how the two business relations were used as collateral to secure borrowing. A second example of the mechanism through which networks generate trust is the guanxi system in China. Guanxi refers to a trusted relationship that can be used to obtain services either directly or indirectly from a person’s social network.7 Guanxi often serves as a substitute for legal contracts, and helps overcome institutional weaknesses of the Chinese legal system (Fock and Woo 1998). To understand the mechanism of guanxi, consider Standifird and Marshall’s (2000) example of a buyer and a seller who share guanxi with a common acquaintance. This third person can act as zhongjian ren, essentially an intermediary, by introducing the buyer to the supplier. In this transaction, the zhongjian ren vouches for the buyer by assuring the supplier that he will be compensated for any sunk investments required for the relationship (e.g., preparing blueprints or samples). If the buyer exploits the supplier, the intermediary will be held responsible; and unless reparations are made, this can damage the relationship between the intermediary and either business partner. This example illustrates the collateral role of guanxi: parties refrain from cheating because it would limit their future exchanges with the intermediary whose guanxi they borrowed. Both of these examples highlight the role of vouching intermediaries and the collateral function of connections in securing transactions. We now develop a model that formalizes these ideas. III. THEORY This section presents a game-theoretic model of informal borrowing in social networks, and shows that the highest loan amount is limited by the maximum network flow (or trust flow) between borrower and lender. In Sections IV and V, where we consider applications, we make use of this reduced form characterization of trust. III.A. 
Model Setup In our model, a borrower needs an asset of a lender to produce a social surplus. This asset might represent a factor of production, 7. The original meaning of the phrase “guan-xi” is using relationships to gain indirect access to a wider network. “Guan” means gate or hurdle and “xi” refers to a relationship; guanxi is thus a gateway to other relationships.
FIGURE II
Model Timeline
Stage 1: Realization of needs. Stage 2: Borrowing arrangement. Stage 3: Repayment. Stage 4: Transfer payments. Stage 5: Friendship utility.
such as a farming tool, a vehicle, or an animal; it could also be an apartment, a household durable good, or simply a cash payment. In the absence of legal contract enforcement, borrowing must be secured by an informal arrangement. In our model, the social network is used for this purpose: connections in the network have associated consumption value, which serves as “social collateral” to enable borrowing. Formally, a social network G = (W, E) consists of a set W of agents (vertices or nodes) and a set E of edges (links), where an edge is an unordered pair of distinct vertices. Each link in the network represents a friendship or business relationship between the two parties involved. We formalize the strength of relationships using an exogenously given capacity c(u, v).
DEFINITION 1. A capacity is a function c : W × W → R such that c(u, v) > 0 if (u, v) ∈ E and c(u, v) = 0 otherwise.
The capacity measures the utility benefits that agents derive from their relationships. For ease of presentation, we assume that the strength of relationships is symmetric, so that c(u, v) = c(v, u) for all u and v.8 Our model consists of five stages, which are depicted in Figure II. We begin by describing the model and then discuss the economic content of our modeling assumptions.
Stage 1: Realization of Needs. Two agents s and t are randomly selected from the social network. Agent t, the lender, has an asset that agent s, the borrower, desires. The lender values the asset at V, and it is assumed that V is drawn from some prior distribution F over [0, ∞). The identities of the borrower and the lender and the value of V are publicly observed by all players.
8. Our results extend to the case where capacities are asymmetric. In that environment, the social network can be represented as a directed graph and the directed network flow determines borrowing.
Stage 2: Borrowing Arrangement. At this stage, the borrower publicly proposes a transfer arrangement to all agents in the social network. The role of this arrangement is to punish the borrower and compensate the lender in the event of default. A transfer arrangement consists of a set of transfer payments h(u, v) for all agents u and v involved in the arrangement. Here h(u, v) is the amount u promises to pay v if the borrower fails to return the asset to the lender. Once the borrower has announced the arrangement, all agents involved have the opportunity to accept or decline. If all involved agents accept, then the asset is borrowed and the borrower earns an income ω(V), where ω is a nondecreasing function with ω(0) = 0. If some agents decline, then the asset is not lent, and the game moves on directly to stage 5.
Stage 3: Repayment. Once the borrower has made use of the asset, he can either return it to the lender or steal it and sell it for a price of V.9 If the borrower returns the asset, then the game moves to the final stage 5.
Stage 4: Transfer Payments. All agents observe whether the asset was returned in the previous stage. If the borrower did not return the asset, then the transfer arrangement is activated. Each agent has a binary choice: either he makes the promised payment h(u, v) in full or he pays nothing. If some agent u fails to make a prescribed transfer h(u, v) to v, then he loses his friendship with agent v (i.e., the (u, v) link “goes bad”). If the (u, v) link is lost, then the associated capacity is set to zero for the remainder of the game. We let $\tilde{c}(u, v)$ denote the new link capacities after these changes.
Stage 5: Friendship Utility. At this stage, agents derive utility from their remaining friends. The total utility enjoyed by an agent u from his remaining friends is simply the sum of the values of all remaining relationships, that is, $\sum_v \tilde{c}(u, v)$.
Our model is a multistage game with observed actions. Let $\Sigma_u$ denote the set of agent u’s pure strategies and let $\Sigma = \times_u \Sigma_u$. We focus on the set of pure strategy subgame perfect equilibria below.
III.B. Discussion of Modeling Assumptions We now discuss some of the assumptions underlying our model. 9. As we show in Appendix II, the model can be extended to the case where the liquidation value of the asset is φ · V with φ ≤ 1.
Social Sanctions. When an agent fails to make a promised transfer, we assume that the associated friendship link automatically goes bad, capturing the idea that friendly feelings often cease to exist if a promise is broken. Appendix II develops explicit micro foundations for this assumption. In these micro foundations, which build on Dixit (2003), failure to make a transfer is a signal that the agent no longer values his friend, in which case these former friends find it optimal not to interact with each other in the future. An alternative justification is that people break a link for emotional or instinctive reasons when a promise is not kept; Fehr and Gachter (2000) provide evidence for such behavior. Circle of Trust. For large social networks it can be unrealistic for the borrower to include socially distant agents in the arrangement. All our results hold if we restrict the set of links over which transfer payments can be proposed to some subgraph of the original network, the “circle of trust,” which may depend on the identity of the borrower and the lender. The only difference in our results is that the network flow measure of the borrowing limit will have to be computed in the subgraph of permissible links. Transfer Arrangement as Social Norms. The transfer arrangement in our model can be interpreted either as an explicit agreement between all parties or as a representation of accepted norms of behavior. In the second interpretation, agents simply share an understanding about what they are expected to do in the event of default. Cash Bonds and Borrowing Constraints. One way to solve the moral hazard problem is to have the borrower post a cash bond to the lender, which is returned only if the borrower does not default on the asset loan. We abstract away from bonds and prepayments by assuming that the borrower is initially cash-constrained. However, we do allow the borrower and other agents to make payments in later stages of the game. 
This can be justified if agents work or make investments in the initial stage, generating income in later stages; or if transfers are in-kind, for example, helping out with the harvest, where posting a bond may be inefficient or infeasible. III.C. Equilibrium Analysis For what values of V can borrowing be implemented in a subgame perfect equilibrium? We begin answering this question
by studying equilibria where all promises are kept, that is, where every transfer h(u, v) is expected to be paid if the borrower fails to return the asset. We later show that focusing on these equilibria is without loss of generality. In any equilibrium where promises are kept, transfers have to satisfy the capacity constraint
(1)    h(u, v) ≤ c(u, v).
This is simply an incentive compatibility constraint. If the borrower fails to return the asset, agent u has to decide whether to make his promised transfer payment h(u, v) to v. The cost of making the payment is h(u, v); the cost of not making the payment is c(u, v), because it results in losing the friendship with v. In any equilibrium where promises are kept, u must prefer the friendship over the monetary value of the transfer, leading to (1).
Two-Agent Network. To build intuition, we begin the equilibrium analysis with the case where the social network consists only of the borrower s and the lender t. Let σ be a pure strategy subgame perfect equilibrium that implements borrowing where promises are kept. In any such equilibrium, V ≤ h(s, t). To see why, assume that the borrower s defaults on the equilibrium path. Then the lender receives the transfer payment h(s, t) instead of the asset; but he must break even to lend, which yields V ≤ h(s, t). On the other hand, if the borrower returns the asset on the equilibrium path, then he must weakly prefer not to default, which again requires V ≤ h(s, t). Combining this inequality with the capacity constraint (1) then yields
(2)    V ≤ c(s, t),
showing that borrowing is limited by the maximum flow in this simple network. It is also easy to see that when (2) is satisfied, there exists an equilibrium that implements borrowing: just set h (s, t) = V .10 Intuitively, the collateral value of friendship can be used to elicit payment and thus solve the agency problem. Four-Agent Network. To gain intuition about borrowing in more general networks, we next consider the network depicted 10. In this equilibrium, all surplus accumulates to the borrower because of our assumption that he proposes the transfer arrangement. In a setup where bargaining power is more evenly distributed, we expect that the surplus would be shared by the agents involved in the transfer arrangements, in a manner similar to Goyal and Vega-Redondo (2007).
FIGURE III
Borrowing in a Four-Agent Network
[Diagram: borrower s, intermediary u, and lender t are connected in a chain by links with capacities c(s, u) and c(u, t); cousin v is connected only to s, by a link with capacity c(s, v).]
This figure illustrates borrowing in networks with intermediaries. The arrangement favored in our paper involves transfers flowing from s through u to t in the event of default. In this arrangement the weakest link, min[c(s, u), c(u, t)], determines the borrowing limit. An alternative arrangement, where cousin v promises to punish the borrower s in case of default, sometimes enforces better outcomes. However, this arrangement is not robust to side deals by groups of agents: the borrower and his cousin can jointly deviate, steal the asset, and short-change the lender. As we show in the text, all side deal–proof arrangements satisfy the weakest link requirement.
in Figure III, which consists of four players: the borrower s, the lender t, an intermediate agent u connecting s and t, and an agent v who is connected only to the borrower s. We will refer to v as the “cousin” of s. A natural transfer arrangement that implements borrowing in this network is one where agent u acts as an intermediary who elicits and transmits payments from s to t in the case of noncompliance, and gets zero net profits. To formalize this arrangement, simply set h(s, u) = h(u, t) = V. For this arrangement to be incentive compatible, the capacity constraint (1) must be satisfied for both links involved: V ≤ c(s, u) must hold so that s delivers the transfer to u, and V ≤ c(u, t) is needed to ensure that u passes on the transfer to t. Combining these yields the “weakest link” inequality
(3)    V ≤ min[c(s, u), c(u, t)],
which implies that the maximum flow determines the borrowing limit in this transfer arrangement. However, networks with more than two agents generally admit other subgame perfect equilibria that can implement borrowing even if (3) fails. We argue that these equilibria are implausible, because they fail a natural coalition-proofness requirement. To illustrate, assume that the borrower s has a strong link to his cousin
v, with a capacity value of c (s, v ) = V + 1. The borrower might then propose an informal arrangement in which he promises to pay his cousin a transfer of h (s, v ) = V + 1 in case he fails to return the asset. This arrangement provides the right incentives to the borrower, and is a subgame perfect equilibrium even if (3) fails. To understand its logic, note that in this arrangement, the borrower essentially makes the following proposal to the lender: “Lend me your asset; if I don’t return it to you, my cousin will be angry with me.” As this interpretation makes it clear, this borrowing arrangement may not be robust to joint deviations where both the borrower and his cousin depart from equilibrium. More concretely, the borrower could circumvent the arrangement by entering a side deal with his cousin, in which he steals the asset and shares the proceeds with the cousin (who in equilibrium would otherwise receive nothing). Due to the possibility of such side deals, we do not find this equilibrium plausible. A similar potential equilibrium is one where the intermediate agent u provides incentives to the borrower but promises a zero transfer to the lender. In this case, the lender effectively “outsources” monitoring to the intermediary, trusting that the borrower will always return the asset rather than pay a high transfer to u. This arrangement is again open to side deals: here s and u can choose to steal the asset jointly and split the proceeds, leaving the lender with nothing. As in the equilibrium with the cousin, the possibility of a side deal arises because nobody “monitors the monitor”: the lender is not fully in control of incentives. When enforcement is outsourced to either the cousin or the intermediary, these agents can team up with the borrower and steal the asset. These examples suggest that when the borrower and other agents can agree to side deals, it may not be in the interest of the lender to provide the asset. 
This motivates our focus on subgame perfect equilibria that are immune to such side deals.
III.D. Side Deal–Proof Equilibrium
Consider the subgame starting in stage 2, after the identities of the borrower and the lender and the value of the asset are realized, and for any pure strategy profile $\sigma \in \Sigma$, let $U_u(\sigma)$ denote the total utility of agent u in this subgame. We formalize the idea of a side deal as an alternative transfer arrangement $\tilde{h}(u, v)$ that s proposes to a subset of agents S ⊂ W after the original arrangement is accepted. If this side deal is accepted, agents in S are expected to make transfer payments according to $\tilde{h}$, whereas agents outside
S continue to make payments described by h. For the side deal to be credible to all participating agents, it must be accompanied by a proposed path of play that these agents find optimal to follow. This motivates the following definition.
DEFINITION 2. A side deal with respect to a pure strategy profile σ is a set of agents S, a transfer arrangement $\tilde{h}(u, v)$ for all u, v ∈ S, and a set of continuation strategies $\{\tilde{\sigma}_u \mid u \in S\}$ proposed by s to agents in S at the end of stage 2, such that
(i) $U_u(\tilde{\sigma}_u, \tilde{\sigma}_{S \setminus u}, \sigma_{-S}) \ge U_u(\sigma'_u, \tilde{\sigma}_{S \setminus u}, \sigma_{-S})$ for all $\sigma'_u$ and all u ∈ S,
(ii) $U_u(\tilde{\sigma}_S, \sigma_{-S}) \ge U_u(\sigma_S, \sigma_{-S})$ for all u ∈ S, and
(iii) $U_s(\tilde{\sigma}_S, \sigma_{-S}) > U_s(\sigma_S, \sigma_{-S})$.
Condition (i) says that all agents u involved in the side deal are best-responding on the new path of play, that is, that the proposed path of play is an equilibrium for all agents in S conditional on others playing their original strategies $\sigma_{-S}$. Condition (ii) says that if any agent u ∈ S refuses to participate in the side deal, then play reverts to the original path of play given by σ. Finally, (iii) ensures that the borrower s strictly benefits from the side deal.
DEFINITION 3. A pure strategy profile σ is a side deal–proof equilibrium if it is a subgame perfect equilibrium that admits no side deals.
It is easy to see that this condition rules out the equilibria violating the weakest link inequality (3) in Figure III. We now turn to extend this result to general networks.11
III.E. Main Theorem
We begin by formally defining the concept of network flows intuitively discussed above.
DEFINITION 4. An s → t flow with respect to capacity c is a function f : G × G → R that satisfies the following:
(i) Skew symmetry: f(u, v) = −f(v, u).
(ii) Capacity constraints: f(u, v) ≤ c(u, v).
(iii) Flow conservation: $\sum_w f(u, w) = 0$ unless u = s or u = t.
11. Our definition of side deal–proof equilibrium does not require side deals to be robust to further side deals.
However, as the proof in Appendix I makes clear, imposing this requirement would not change any of our results: when there is a deviating side deal, there is also one that is robust to further coalitional deviations, namely the side deal implemented with a network flow.
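The three conditions in Definition 4 can be checked mechanically. The following is a minimal sketch, not part of the paper: the helper names `skew_symmetric` and `is_flow` are hypothetical, and the example uses the link values that the text reports for Figure IB (s–t = 2, s–u = 3, u–t = 4).

```python
def skew_symmetric(net_payments):
    """Expand net amounts {(u, v): x} into a skew-symmetric f (hypothetical helper)."""
    f = {}
    for (u, v), x in net_payments.items():
        f[(u, v)] = x
        f[(v, u)] = -x
    return f


def is_flow(f, capacity, nodes, s, t):
    """Return True if f satisfies conditions (i)-(iii) of Definition 4."""
    def c(u, v):
        # Capacities are symmetric; pairs without a link have capacity 0.
        return capacity.get((u, v), capacity.get((v, u), 0))

    def flow(u, v):
        return f.get((u, v), 0)

    for u in nodes:
        for v in nodes:
            if flow(u, v) != -flow(v, u):   # (i) skew symmetry
                return False
            if flow(u, v) > c(u, v):        # (ii) capacity constraint
                return False
    # (iii) flow conservation at every node other than s and t
    return all(sum(flow(u, w) for w in nodes) == 0
               for u in nodes if u not in (s, t))


nodes = ["s", "u", "t"]
capacity = {("s", "t"): 2, ("s", "u"): 3, ("u", "t"): 4}

good = skew_symmetric({("s", "t"): 2, ("s", "u"): 3, ("u", "t"): 3})
too_big = skew_symmetric({("s", "u"): 4, ("u", "t"): 4})   # exceeds c(s, u) = 3
leaky = skew_symmetric({("s", "u"): 3, ("u", "t"): 2})     # u keeps 1 unit

print(is_flow(good, capacity, nodes, "s", "t"))
print(is_flow(too_big, capacity, nodes, "s", "t"))
print(is_flow(leaky, capacity, nodes, "s", "t"))
```

In this sketch the first candidate sends 2 directly from s to t and 3 through u; the other two candidates violate the capacity constraint and flow conservation, respectively.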
The value of a flow is the amount that leaves the borrower s, given by $|f| = \sum_w f(s, w)$. Let $T^{st}(c)$ denote the maximum value among all s → t flows.
THEOREM 1. There exists a side deal–proof equilibrium that implements borrowing between s and t if and only if the asset value V satisfies
(4)    $V \le T^{st}(c)$.
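To make the borrowing limit in Theorem 1 concrete, the sketch below computes the maximum flow for the three networks of Figure I. It is an illustration only: it uses the standard Edmonds–Karp algorithm (the paper does not prescribe an algorithm), the function name `max_flow` is hypothetical, and for Panel C the assignment of the values 2 and 1 to the s–v and v–t links is an assumption (only their minimum matters for the result).

```python
from collections import deque


def max_flow(capacity, s, t):
    """Edmonds-Karp maximum flow for symmetric link values {(u, v): c}."""
    # An undirected link of value c can carry up to c in either direction.
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v][u] = residual[v].get(u, 0) + c
    if s not in residual or t not in residual:
        return 0

    total = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total

        # The bottleneck is the weakest residual link on the path found.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount along the path.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


panel_a = {("s", "t"): 2}
panel_b = {("s", "t"): 2, ("s", "u"): 3, ("u", "t"): 4}
panel_c = {("s", "u"): 3, ("u", "t"): 4, ("s", "v"): 2, ("v", "t"): 1}

print(max_flow(panel_a, "s", "t"))  # 2
print(max_flow(panel_b, "s", "t"))  # 5
print(max_flow(panel_c, "s", "t"))  # 4
```

The three outputs match the borrowing limits stated in the introduction: 2 for the two-agent network, 2 + min[3, 4] = 5 with a common friend, and min[3, 4] + min[2, 1] = 4 with no direct link.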
This result states that the endogenous borrowing limit equals the value of the maximum flow between borrower s and lender t. We interpret the borrowing limit $T^{st}(c)$ as a measure of network-based trust: if s can borrow more from t, it must be that t has higher trust in s.
The logic of the proof of Theorem 1 is as follows. When V satisfies inequality (4), a side deal–proof equilibrium is easy to construct: by assumption, there exists an s → t flow with value V, and this flow can be used as a transfer arrangement. Flow conservation implies that all intermediate agents break even, confining their role to simply extracting and transmitting the payment V from s to t in case s fails to return the asset. Thus the lender is in full control of incentives; because of this, the equilibrium is easily seen to be side deal–proof.
To show that no side deal–proof equilibrium can implement a higher level of borrowing, we build on the maximum flow–minimum cut theorem (Ford and Fulkerson 1956), which states that the maximum network flow between s and t equals the value of the minimum cut. A cut is a disjoint partition of the nodes into two sets G = S ∪ T such that s ∈ S and t ∈ T, and the value of the cut is defined as the sum of c(u, v) for all links such that u ∈ S and v ∈ T. For any borrowing arrangement violating (4), we can construct a side deal in the following way. Fix a minimum cut (S, T); the maximum flow–minimum cut theorem implies that the total capacity of all links between S and T is less than V. But then agents in S have a profitable side deal: by defecting as a group, they lose less than V in foregone friendships, but gain V from selling the asset.
For a concrete example, consider the network with the cousin in Figure III and suppose that c(s, u) < c(u, t). The minimum cut between s and t has value c(s, u), and the corresponding partition is simply S = {s, v} and T = {u, t}. In any equilibrium where V > c(s, u), that is, where (4) is violated, agents in
FIGURE IV
Maximum Network Flow with Transfer Constraints
Panels: A. Original network; B. Auxiliary network.
This figure illustrates network flow with transfer constraints. Agent s could normally borrow an asset up to value 4 from agent t. However, he faces a binding transfer constraint of 3.5. We can calculate network flow in the constrained graph by drawing an auxiliary network where we split s into two agents s1 and s2. All incoming links of agent s are connected to s1 and all outgoing links emanate from agent s2. A directed link from s1 to s2 has capacity equal to the transfer constraint. The network flow from s1 to agent t equals the maximum network flow with transfer constraint.
S have a side deal: the borrower s and his cousin v can team up to steal the asset, because their total repayment is limited by the value of the cut c(s, u).
III.F. Extensions: Transfer Constraints and Endogenous Circle of Trust
Transfer Constraints. In environments with credit constraints, agents might have limits on the total amount they can borrow or transfer. For example, in Figure IVA, the intermediaries u and v might worry that if the borrower s carried too large a debt burden, he would be unable to pay. We show that the concept of network flows can be used to characterize borrowing in this environment as well. To introduce borrowing and transfer constraints in a simple way, suppose that each agent u can make a total payment of at most k_u to others in the network, where the transfer constraints k_u are exogenous. Here k_u can represent either cash or time constraints.12
12. For an intermediate agent (but not for the borrower), incoming transfers may help relax cash constraints. For these intermediate agents, k_u represents constraints that remain after incoming payments; for example, these could be time constraints if the transfers were in-kind services, such as helping out, which cannot be easily passed on.
How much borrowing can be implemented in this environment? We show that the answer is given by the maximum flow in a modification of the social network, where each agent u is replaced by two identical agents connected by a link with capacity
k_u. To formally construct this auxiliary (directed) network G′, replace each node u in G with a pair of nodes, u_1 and u_2, and replace each (u, v) link with two new directed links, a u_2 → v_1 link and a v_2 → u_1 link, each with capacity equal to c(u, v). Finally, for each agent u, create a new u_1 → u_2 link with capacity equal to the transfer constraint, c(u_1, u_2) = k_u. That is, we duplicate all agents u, point all incoming links of u to u_1, have all outgoing links of u originate in u_2, and let the capacity of the u_1 → u_2 link be k_u. For example, consider the network in Figure IVA, where agent s faces a binding transfer constraint of 3.5. The corresponding auxiliary network is drawn in Figure IVB and we can deduce that the constrained network flow equals 3.5, the flow from agent s_1 to agent t in the auxiliary graph.
In Appendix I we show that in any side deal–proof equilibrium where promises are kept, the borrowing limit in the presence of transfer constraints equals the value of the maximum s_1 → t_1 flow in G′. To understand the intuition, consider a maximal flow. As in the basic model, the amounts assigned to links between agents by this flow can be interpreted as the transfer payments in a candidate transfer arrangement. It remains to verify that, in this arrangement, no agent u exceeds his total transfer constraint k_u. But this follows by construction of G′. The total transfers promised by u must be equal to the flow leaving u_2 in G′; but by flow conservation, this must be equal to the value carried over the u_1 → u_2 link, which is bounded by the link capacity of k_u in G′.
Circle of Trust. We can endogenize the “circle of trust,” that is, the set of permissible links over which transfer arrangements can be proposed, by assuming that there is a fixed cost associated with proposing various transfer arrangements.
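The auxiliary-network construction for transfer constraints described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the link capacities below are hypothetical values chosen only so that the unconstrained flow is 4 and the constraint k_s = 3.5 binds, as in the Figure IV caption, and the function names (`max_flow`, `original_network`, `auxiliary_network`) are assumptions introduced here. The flow routine is a standard Edmonds-Karp search.

```python
from collections import defaultdict, deque

INF = float("inf")

def max_flow(cap, source, sink):
    """Edmonds-Karp: push flow along shortest augmenting paths.

    `cap` maps node -> {neighbor: residual capacity}; it is modified in place.
    """
    total = 0.0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in list(cap[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Recover the path, find its bottleneck, update residual capacities.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        total += push

def original_network(links):
    """Each undirected link (u, v) can carry up to c(u, v) in either direction."""
    cap = defaultdict(lambda: defaultdict(float))
    for (u, v), c in links.items():
        cap[u][v] += c
        cap[v][u] += c
    return cap

def auxiliary_network(links, k):
    """Split each agent u into (u, 1) -> (u, 2) with capacity k_u (unconstrained if absent)."""
    cap = defaultdict(lambda: defaultdict(float))
    for u in {n for link in links for n in link}:
        cap[(u, 1)][(u, 2)] = k.get(u, INF)
    for (u, v), c in links.items():
        cap[(u, 2)][(v, 1)] += c  # outgoing side of u feeds incoming side of v
        cap[(v, 2)][(u, 1)] += c
    return cap

# Hypothetical capacities (assumption): not read from Figure IV itself.
links = {("s", "u"): 3, ("s", "v"): 2, ("u", "t"): 3, ("v", "t"): 1}
unconstrained = max_flow(original_network(links), "s", "t")
constrained = max_flow(auxiliary_network(links, {"s": 3.5}), ("s", 1), ("t", 1))
print(unconstrained, constrained)
```

Splitting each agent into an incoming and an outgoing node turns a limit on an agent's total payments into an ordinary link capacity, so the same flow routine handles both the basic and the constrained model.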
For each subgraph G_0 ⊆ G, let κ(G_0) ≥ 0 denote the cost of a transfer arrangement that includes all links in G_0.13 Assume that κ is monotone in the sense that if G_0 ⊆ G_0′ then κ(G_0) ≤ κ(G_0′). The function κ can be interpreted as a characteristic of the community’s social norm; for example, in a kin-based society, we expect κ to be zero or small for most family and relative links.
13. For two networks G = (W, E) and G′ = (W′, E′), we say that G′ ⊆ G if W′ ⊆ W and E′ ⊆ E.
Agent s, who wishes to borrow V from t, must now solve the cost-minimization problem
min{κ(G_0) | G_0 ⊆ G such that T^{st}_{G_0} ≥ V},
where T^{st}_{G_0} is the trust flow between s and t in G_0. The solution
G_0^*, if it exists, is the minimum cost subgraph where borrowing V can still be supported. Agent s then chooses to borrow if and only if his profit from the loan exceeds the cost, that is, ω(V) ≥ κ(G_0^*). Besides its added flexibility, this framework also yields two new implications. (1) The set of people involved in an arrangement is endogenously determined: the greater the profits ω(V), the more the borrower is willing to extend his circle of trust.14 (2) With positive κ, agents only borrow when profits are high enough; assets that generate low returns are never secured through social collateral.
IV. APPLICATIONS
IV.A. Network Structure and Welfare
We now explore how the network structure affects the payoffs from borrowing in the social collateral model. Because the network is completely summarized by the vector of capacities c, the borrowing limit T^{st}(c) can be viewed as a “trust map” that determines, as a function of the network structure c, how much trust is created between s and t. To see how trust determines payoffs, let Π^{st}(c) denote the expected payoff of s from borrowing, conditional on the lender being agent t; then
(5) Π^{st}(c) = Π(T^{st}(c)), where Π(z) = ∫_0^z ω(v) dF(v),
because the payoff is just the expectation of ω(V) over all values of V that do not exceed the borrowing limit T^{st}(c). Changes in the network affect the payoffs through changes in the trust flow T^{st}(c). Our goal in this section is to characterize these welfare effects.15
Monotonicity. We first explore the effect of increasing connectivity by adding new links or strengthening existing links. We say that the network associated with capacity c_1 is more strongly connected than that associated with c_2 if no link has lower capacity under c_1 than under c_2; that is, c_1(u, v) ≥ c_2(u, v) for all u, v ∈ W. We then have the following monotonicity result.
14. Formally, an increase in ω(V) holding fixed V can change the sign of ω(V) − κ(G_0^*) from negative to positive and induce borrowing.
15. Besides the profit from borrowing Π^{st}(c), the borrower s also derives utility from his friends. In the subsequent analysis we focus on the payoff from borrowing.
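As a hedged illustration of equation (5), write Π(z) for the payoff function defined there and suppose, purely as an assumption for this example, that asset values are uniformly distributed on [0, V̄] and that profits are linear, ω(v) = ρv with ρ > 0. Neither functional form comes from the model itself.

```latex
\Pi(z) = \int_0^z \omega(v)\, dF(v)
       = \int_0^z \rho v \,\frac{1}{\bar V}\, dv
       = \frac{\rho z^{2}}{2 \bar V},
\qquad 0 \le z \le \bar V .
```

In this special case the expected payoff rises with the square of the trust flow z = T^{st}(c), so any change in the network that raises the trust flow also raises the payoff from borrowing.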
PROPOSITION 1. If the social network with capacity c_1 is more strongly connected than the network with capacity c_2, then for any borrower s and lender t, both trust and payoffs are weakly higher: T^{st}(c_1) ≥ T^{st}(c_2) and Π^{st}(c_1) ≥ Π^{st}(c_2).
Networks with more and stronger links generate more trust and higher payoffs due to the increased supply of social collateral. A large body of work in sociology relies on the result formalized here: Putnam (1995), for example, argues that “networks of civic engagement (. . . ) encourage the emergence of social trust.” The fact that this monotonicity emerges naturally in the social collateral model makes it a useful candidate for exploring other questions related to network-based trust.
Closure and Structural Holes. We now turn to study how the deeper structure of the network affects payoffs, focusing on changes in network closure, a concept often discussed in the sociology literature. Networks have high closure if the neighborhoods of connected agents have a large overlap. To illustrate, consider the two network neighborhoods of agent s in Figure V, which is a small variation of Figure 1 in Coleman (1988). The neighborhood of s in Figure VB has higher closure, because the friends of s are directly connected. This idea of closure can also be formulated using network paths: a neighborhood has high closure if it connects s to few others through many paths (as in Figure VB), whereas it has low closure if it connects s to many others through fewer paths each (Figure VA). The sociological literature has two views about the benefits of closure. One view, dating back to Coleman (1988), argues that high closure is good because it facilitates sanctions, making it easier for individuals to trust each other.
In his discussion of the wholesale diamond market in New York City, Coleman explains that “If any member of this community defected through substituting other stones or stealing stones in his temporary possession, he would lose family, religious and community ties.” Similarly, in the context of Figure V, Coleman argues that in the high closure network of Figure VB, agents t1 and t2 can “combine to provide a collective sanction, or either can reward the other for sanctioning.” In contrast, Granovetter (1973) and Burt (1995) argue that loose networks with low closure lead to higher performance, because they allow agents to reach many others through the network. Burt also emphasizes the role of structural holes, that is, people who bridge otherwise disconnected networks: for example,
FIGURE V
Network Neighborhoods with Increasing Network Closure (Panel A: Low closure; Panel B: High closure)
This figure shows network neighborhoods with increasing network closure. The two neighborhoods shown are a small variation on Figure 1 in Coleman (1988). With unit link capacities, agent s is connected through four paths to the rest of the network in both neighborhoods. In a low-value exchange environment, the neighborhood in Panel A is more attractive because it provides access to more people. In a high-value exchange environment, the neighborhood in Panel B is more attractive, because closure allows borrowing high-valued assets from t_1 and t_2.
s is a structural hole in Figure VA but not in VB. According to Burt (2000), these structural holes “broker the flow of information between people, and control the projects that bring together people from opposite sides of the hole.” A key part of this argument is that low-closure networks provide easier access to small favors, advice, information, and other resources.
To explore these issues in the social collateral model, we first develop a measure of network closure, building on the idea that high closure is associated with having multiple paths to a smaller set of agents. We begin by counting the total number of paths of an agent, using the concept of network flows. Fix a network with integer-valued capacities c; then the network flow T^{st}(c) is effectively the number of disjoint paths of unit capacity between s and t. Thus, the total path number for s is simply T^s(c) = Σ_{t∈W} T^{st}(c). In Figure V, s has a total of four paths in both networks; the difference in closure comes from the fact that in VA, these four paths reach four different people, whereas in VB they reach only
two people, but there are two paths connecting s with either of them.16 To generalize this observation, let P^s(n) denote the share of paths s has with agents to whom he has at least n paths, so that P^s(2) = 0 in Figure VA and P^s(2) = 1 in Figure VB.17 Clearly, P^s(0) = 1 always, and P^s(n) is nonincreasing in n.
16. To see why s has four paths in Figure VB, note that there are two paths connecting s to t_1, the direct one and the indirect one through t_2; and similarly, two paths connect s to t_2.
17. If arrangements are limited by a circle of trust, then T^s(c) and P^s(n) need to be computed in the corresponding subgraph of permissible links.
DEFINITION 5. The network neighborhood of s has a higher closure than the neighborhood of s′ if
(i) T^s(c) = T^{s′}(c), so that s and s′ have the same total number of paths; and
(ii) for each n, P^s(n) ≥ P^{s′}(n), so that a greater share of paths connect s to people with whom he has many paths.
These conditions imply that if the neighborhood of s has higher closure, then s is connected to fewer people through many paths.18
18. Also note that (ii) is equivalent to requiring that the cumulative distribution function 1 − P^s(·) first-order stochastically dominates 1 − P^{s′}(·).
This definition allows us to compare high- and low-closure neighborhoods. The key theoretical insight is that higher closure increases trust but reduces access. For example, in Figure VB, two people trust s with assets of value V ≤ 2; although access is low, trust is high in this closed network. In contrast, in Figure VA, s can borrow from four people, but the asset value can be at most 1: access has increased, but at the cost of a reduction in pairwise trust. Due to this trade-off, whether high or low closure is associated with greater welfare depends on what assets are exchanged: trust is more important for high-value assets whereas access matters more for low-value assets.
To formalize this trade-off between access and pairwise trust, we let f(v) denote the density of F(v) and let ω̃(V) = f(V)ω(V), the frequency-weighted profits from the ability to borrow V. Note that ω̃(V) depends both on the probability that an asset of value V is needed (f(V)), and on the profits this asset generates (ω(V)). We say that the economy is a high-value exchange environment if ω̃(V) is increasing: in this case high-value transactions generate greater welfare ω̃(V), either because they are more likely or because they are more productive. Conversely, we say we are in a low-value exchange environment when ω̃(V) is decreasing.
PROPOSITION 2. In a high-value exchange environment, a neighborhood with higher closure leads to a higher expected payoff to s. Conversely, in a low-value exchange environment, a neighborhood with higher closure leads to a lower expected payoff to s.
In a low-value exchange environment, the access provided by low closure is more attractive, because knowing more people directly or indirectly increases the likelihood that s can obtain a low-value asset. This logic is in line with Granovetter’s and Burt’s basic argument about the strength of weak ties and the benefits of a dispersed social network in providing access to assets with low moral hazard, such as small favors, information, or advice.19 In contrast, in a high-value exchange environment, closure is better. Here, a reduction in access is more than compensated for by the fact that, through his dense connections, s will be able to borrow even high-value assets. This finding parallels Coleman’s general argument for network closure, and particularly his example of the wholesale diamond market in New York City, where the exchange of valuable stones requires high trust between dealers.20
19. Section IV.B develops a variant of our basic setup where exchange of information is explicitly modeled.
20. Vega-Redondo (2006) reports a related finding in a model of repeated games played in networks. He shows that stability of cooperative behavior depends on a certain measure of network cohesiveness.
The results of Proposition 2 are related to Putnam’s (2000) concepts of bridging and bonding social capital. In Putnam’s view, bonding social capital is associated with dense social networks and is good for generating reciprocity between agents who know each other well. In contrast, the networks underlying bridging social capital are “outward looking and encompass people across diverse social cleavages,” and are good for “linkage to external assets and for information diffusion.” These two concepts parallel our distinction between trust and access; our results thus provide formal foundations as well as network-based measures for bonding and bridging social capital.
Community Size and Network Closure. What determines network closure? In Allcott et al. (2007), we argue that in practice, community size should be an important determinant. The
FIGURE VI
Community Size and Network Closure (horizontal axis: number of friends, 0 to 20; vertical axis: network closure P^s(2), 0 to 1; series: below median size, above median size)
The figure is taken from Allcott et al. (2007). The figure plots average network closure for students by the number of their friends for schools below median size (solid line) and schools above median size (dashed line). For each student s, closure is measured as P^s(2), the share of paths s has to others with whom he has at least two paths, within the circle of trust that includes links up to distance 2 from s. See Definition 5 and Section IV.A for details. The figure is constructed using data from 142 U.S. middle and high schools in the National Longitudinal Study of Adolescent Health; observations with number of friends greater than 19 were excluded (less than 1% of total).
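The closure measure P^s(n) and the comparison in Definition 5 can be sketched as follows. This is an illustrative sketch only: the function names are assumptions, and the two dictionaries simply encode the path counts that the text attributes to Figure V (one path to each of four people in Panel A, two paths to each of two people in Panel B).

```python
def closure_share(paths, n):
    """P^s(n): share of s's paths that reach agents with whom s has at least n paths.

    `paths` maps each other agent t to the integer path count T^{st}(c).
    """
    total = sum(paths.values())
    if total == 0:
        raise ValueError("agent has no paths")
    return sum(p for p in paths.values() if p >= n) / total

def has_higher_closure(paths_a, paths_b):
    """Definition 5: equal total path numbers and P^a(n) >= P^b(n) for every n."""
    if sum(paths_a.values()) != sum(paths_b.values()):
        return False
    top = max(list(paths_a.values()) + list(paths_b.values()))
    return all(
        closure_share(paths_a, n) >= closure_share(paths_b, n)
        for n in range(top + 1)
    )

# Path counts as described in the text for Figure V (unit link capacities).
low_closure = {"t1": 1, "t2": 1, "t3": 1, "t4": 1}   # Panel A
high_closure = {"t1": 2, "t2": 2}                    # Panel B

print(closure_share(low_closure, 2), closure_share(high_closure, 2))
print(has_higher_closure(high_closure, low_closure))
```

Checking n only up to the largest path count is enough here, because P^s(n) is zero for both neighborhoods beyond that point.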
intuition is straightforward: in a small community, the pool of potential friends is limited, which makes it more likely that two agents share common friends. In Allcott et al. (2007), we confirm this intuition using data on the social networks of students in the National Longitudinal Study of Adolescent Health (AddHealth).21 Normalizing all link capacities to unity, we build on Definition 5 to measure the closure of the network around a student s with P^s(2), the share of all paths that s has that connect him or her with others with whom he or she is connected through at least two paths.22
21. AddHealth is a representative sample of 142 U.S. public and private middle and high schools in 1994 and 1995.
22. We also restrict the “circle of trust” to links that are within distance 2 from agent s. The distance of a link (u, v) from s is the arithmetic average of the length of the shortest paths connecting s to u and s to v.
This quantity is always between zero and one, and higher values represent more closed networks. Figure VI compares this measure
of closure for schools below and above the median size, for each possible value of a student’s number of friends. This figure confirms that community size is an important predictor of closure in practice: even holding fixed a student’s number of friends, smaller communities exhibit higher network closure. Implications for Organizations. The connection between community size and closure, combined with Proposition 2, has implications for organizational design. In environments where access to small favors such as providing information is important, communities should be larger. This can be achieved through a flat organizational structure where rank does not limit interactions. For example, academic communities in the United States have a relatively informal culture, generating a large community of researchers; this encourages the development of weak ties and creates access to ideas. In contrast, organizations where trust is important can create it by having smaller communities. For instance, the hierarchical structure of armies limits interactions to peers of the same rank, creating networks with high closure and bonding social capital. Our results also help explain the empirical fact that community size is often negatively correlated with prosocial behaviors such as volunteering, work on public projects, and helping friends (Putnam 2000). The traditional explanation is that in large communities people have fewer friends (Jacobs 1993). Our results suggest that even controlling for the number of friends, large communities have less dense social networks, which limits the provision of valuable public goods. IV.B. Job Search and Trust in Recommendations Sociologists have long recognized the importance of networks for finding jobs. For example, in Getting a Job, Granovetter (1974) documents that 56% of his sample of white-collar workers found employment through personal contacts. 
One possible explanation is that information about job openings often travels through friends and acquaintances. This logic forms the basis of Granovetter’s (1973) “strength of weak ties” theory, formally modeled by Calvo-Armengol and Jackson (2004), which predicts that weak links to agents with whom one has few common friends are most useful for job search, because they provide access to otherwise unobtainable information. However, the evidence about the strength of weak ties is mixed. Studies in U.S. cities (Bridges and
Villemez 1986; Marsden and Hurlbert 1988) find that both weak and strong ties are important for job search. In Japan, Watanabe (1987) documents that small business employers screen applicants using strong ties. In China, Bian (1997, 1999) argues that the guanxi system of personal relationships allocates jobs using strong ties and paths.
Granovetter (1974) provides a second reason for the importance of connections: networks can generate trust in job recommendations. When there is asymmetric information about the skills of job candidates, offers are often made based on the opinions of trusted recommenders. In Granovetter’s sample, such trusted referrals are common: in 60% of all jobs obtained through a network path of length 2 or more, the worker’s direct contact had “put in a good word” for him. Because trusted referrals are more likely to come through strong ties, this logic can help explain why many empirical studies have found strong ties to be more important.
We now explore the implications of network-based trust for job search using the social collateral model.23 Consider an employer t who needs to fill a vacancy. Potential employees are either high or low types; if hired, a high type generates total value S_H and a low type generates S_L, where S_H > S_L > 0. In the formal labor market, worker types are unobservable, the proportion of high types is π_H, and the prevailing market wage rate is w. Thus, hiring from the labor market generates an expected surplus S = π_H S_H + (1 − π_H) S_L, of which S − w accrues to the employer. However, the employer may be able to hire a known high type through his social network. If s is a high-type job candidate, and his type can be credibly communicated to the employer, then the surplus from hiring s versus hiring from the formal labor market is S_H − S.
Assuming that this surplus is divided by Nash bargaining, where the bargaining weight of the worker is α, the wage of s if hired is w_H = w + α · (S_H − S), and the excess profit of the firm relative to hiring from the labor market is (1 − α) · (S_H − S).
23. Saloner (1985) and Simon and Warner (1992) also study informal recommendations in labor markets. These papers set aside trust considerations by assuming that recommenders and firms have the same objective.
Can the network credibly communicate the worker’s type to the employer? To answer, assume that the type of worker s is only observed by himself and his direct friends, denoted s_1, . . . , s_k. Although these friends can, in principle, provide recommendations,
they face a moral hazard problem: a low-type worker s can bribe them to write good recommendations. Here bribes are interpreted broadly to include in-kind transfers, as well as being nice to the recommender. The amount candidate s is willing to spend on bribes is limited by the attractiveness of the job, α · (S_H − S); if he or she offers more, the bribes would exceed the profit from getting the job. This reasoning suggests that the network can only communicate worker type in a credible way when the employer’s trust of recommenders s_1, . . . , s_k exceeds the highest bribe that the worker can pay, α · (S_H − S).
To formalize these ideas, we modify the basic model as follows. First, we assume that prior to sending recommendations, agents agree on an informal transfer arrangement that is to be activated if the worker turns out to be a low type. This arrangement represents the understanding that recommenders will be held responsible for bad recommendations. Second, we introduce the concept of side deals with bribes, where agent s might propose a new transfer arrangement, together with a set of bribes to be paid to his friends, s_1, . . . , s_k, in exchange for their good recommendations.24 Finally, we introduce an auxiliary network, G_∞, where links between s and his friends, s_1, . . . , s_k, have infinite capacity, and T̃^{st}(c) denotes the trust flow between s and t in this network.
PROPOSITION 3. In an equilibrium robust to side deals with bribes, low-type workers are never hired through the network. If and only if T̃^{st}(c) ≥ α · (S_H − S), there exists an equilibrium robust to side deals with bribes where a high-type worker s is hired.
The result simply states that when network-based trust between the employer and recommenders exceeds the sensitivity of profits to worker type, as measured by the term α · (S_H − S), the true type of the worker can be credibly communicated. Several implications about networks and labor markets follow.
24. The formal details of these modifications are presented in Appendix I.
(1) Network-based trust should be more important for high-skilled jobs, where the employer’s profits are more sensitive to worker type. Proposition 2 then predicts a trade-off between weak and strong ties: for low-skill jobs, where type matters less, weak connections are best because they maximize access; but for high-skilled jobs, recommendations through strong links embedded in a dense network are more useful. (2) Jobs obtained through the network should
earn higher wages than jobs obtained in the market. Simon and Warner (1992) obtain the same prediction, but their mechanism is different: in their work, networks reduce uncertainty about the quality of the match, increasing the reservation wage; in contrast, in our model only high types are hired through the network. (3) Due to the increased importance of trust for high-quality jobs, the wage differential between network-based and market-based hires, w_H − w = α · (S_H − S), should be positively related to skill intensity. (4) When filling high-skill vacancies, employers should search more through their networks.
These predictions are consistent with several empirical facts. The first prediction helps explain the mixed evidence about the strength of weak ties by showing that for many jobs strong ties should be more important; it also implies that the strength of weak ties should vary with the skill intensity of the job, a prediction that awaits empirical testing. Consistent with the second prediction, Granovetter (1974) reports that in his sample, “jobs offering the highest salary are much more prone to be found through contacts than others: whereas less than half of jobs yielding less than $10,000 per year were found by contacts, the figure is more than three-quarters for those paying more than $25,000.” This positive correlation between referrals and salary is also confirmed by Corcoran, Datcher, and Duncan (1980) and Simon and Warner (1992). Regarding the intensity of network search, Brown (1967) finds that among college professors, personal networks are more frequently used in obtaining jobs of higher rank, smaller teaching loads, and higher salaries and at more prestigious colleges. For these attractive jobs, reducing asymmetric information is likely to be more important, and hence, employers have a stronger preference for searching through their networks.
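The wage and credibility conditions behind these predictions can be made concrete with a small numerical sketch. The function name, argument names, and all numbers below are hypothetical and chosen only for illustration; the formulas are the ones stated in the text: expected market surplus S = π_H S_H + (1 − π_H) S_L, referral wage w_H = w + α · (S_H − S), firm excess profit (1 − α) · (S_H − S), and the requirement in Proposition 3 that the relevant trust flow be at least α · (S_H − S).

```python
def referral_outcome(s_high, s_low, share_high, market_wage, alpha, trust_flow):
    """Evaluate the job-referral formulas for one employer-candidate pair.

    trust_flow stands for the trust flow between candidate and employer in the
    auxiliary network of Proposition 3 (an input here, not computed).
    """
    expected_surplus = share_high * s_high + (1 - share_high) * s_low  # S
    gain = s_high - expected_surplus                                   # S_H - S
    return {
        "expected_surplus": expected_surplus,
        "wage": market_wage + alpha * gain,          # w_H
        "wage_premium": alpha * gain,                # w_H - w
        "firm_excess_profit": (1 - alpha) * gain,
        "credible": trust_flow >= alpha * gain,      # Proposition 3 condition
    }

# Hypothetical numbers (assumption): S_H = 100, S_L = 40, half of workers are high types.
strong_tie = referral_outcome(100, 40, 0.5, 30, 0.5, trust_flow=20)
weak_tie = referral_outcome(100, 40, 0.5, 30, 0.5, trust_flow=10)
print(strong_tie["wage"], strong_tie["credible"], weak_tie["credible"])
```

With these numbers the wage premium is 15, so a recommendation is credible only when the trust flow is at least 15; raising S_H − S, that is, making the job more skill-sensitive, raises both the premium and the trust needed to support it.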
Our predictions would not emerge in a model where the network served purely as a source of information about job vacancies. In such an economy, the network does not reduce information asymmetries; hence the wage differential is zero and the importance of network-based recommendations does not vary with the type of the job. Our results thus suggest that a full analysis of networks in labor markets should incorporate both information transmission and trust in recommendations. Trust and Asymmetric Information. The social collateral model can also be used to study other situations involving asymmetric information. For example, a simple alteration of our job
search framework shows that network-based recommendations can help identify whether a given borrower is intrinsically a trustworthy type.25 A similar logic applies for transactions of valuable assets such as houses, which involve a potential “lemons” problem: sellers with whom the buyer has a high trust flow are more likely to be honest about the quality of the good, to avoid future retribution through social sanctions.26 We conclude that the implications of social collateral in the presence of asymmetric information are similar to the basic model with moral hazard: higher trust flow can secure transactions where there is greater exposure to asymmetric information.
25. Karlan (2005) documents evidence that there is variation in individuals’ trustworthiness, which is predictive of their financial behavior.
26. In line with this prediction, in the 1996 General Social Survey, 40% of home purchases and 44% of used car purchases involved a direct or indirect network connection between the buyer and seller or realtor (DiMaggio and Louch 1998).
V. MEASURING SOCIAL COLLATERAL IN PERU
We now empirically evaluate the social collateral model using a unique data set from two low-income Peruvian shantytown communities, collected by Dean Karlan, Markus Mobius, and Tanya Rosenblat, further described in Karlan et al. (2008). Two key features of these data make them particularly useful for our purposes: (1) information on the social networks of individuals and (2) data on informal loans between friends, relatives, and acquaintances.
V.A. Data Description
In 2005, a survey was conducted in two communities located in the Northern Cone of Lima. The heads of households and spouses (if available) in 299 households were interviewed. The survey consisted of two components: a household survey and a social network survey. The household survey recorded a list of all members of the household and basic demographic characteristics, including sex, education, occupation, and income; summary statistics for these variables are reported in Table I. Average monthly household income in the two communities was 957 and 840 Peruvian new soles (S/.), respectively, which equals approximately 294 and 258 US$, using the exchange rate in 2005. The social network component of the survey asked the household head and spouse to list up to ten individuals in the community
TABLE I
SUMMARY STATISTICS FOR TWO SHANTYTOWN COMMUNITIES IN PERU
Demographic variables        Mean      Standard dev.
Female                       0.50      0.50
Age                          35.84     14.37
Secondary ed.                0.71      0.21
Household inc. (S/.)         887.39    1,215.74
Business-owner               0.20      0.40
Social network variables     Mean      Standard dev.
Number of contacts           8.60      4.15
Share of “neighbors”         0.59      0.49
Share of “friends”           0.39      0.49
Share of “relatives”         0.02      0.15
Avg. size of loan (S/.)      75.88     121.20
Geographic dist.             41.16     49.17
Note. The table shows summary statistics for adults (age at least 18). Income and loan amounts are reported in Peruvian new soles (S/.). The exchange rate at time of the survey was 3.25 S/. for one US$. Network variables are calculated for the nondirected network where a pair of individuals are classified as connected if one of them names the other as a friend. Geographic distance is reported in meters.
with whom the respondent spent the most time in an average week. We use these data to construct an undirected “OR”-network, where two agents have a link if one of them names the other. Agents have, on average, 8.6 links, and the average geographic distance between connected agents is 42 and 39 m in the two communities; this is considerably less than the geographic distance between two randomly selected addresses, which is 132 and 107 m, respectively.27 About 59% of relationships were classified by respondents as “vecino” (neighbor) and 39% as “amigo” or “compadre” (friend). The share of “relativos” was just 2%.28 Vecinos live slightly closer than amigos/compadres (35 versus 51 m). Over 90% of directly connected people met in the neighborhood for the first time.
Importantly for our purposes, the social network survey also recorded, for each responder, the set of friends from whom he or she had borrowed money during the previous twelve months. There were 254 informal loans in the data set; 167 borrowers in 138 households reported having borrowed on average 76 S/. (about 23 US$) from 173 lenders during the past twelve months. Thus, informal borrowing is very common in these communities: 46% of all households have at least one household member who borrowed money in this manner. The mean age of both the borrower and the lender is 39 years and they live, on average, 36 m apart.
27. This is consistent with a body of work showing the importance of social distance in meeting friends, for example, Marmaros and Sacerdote (2006).
28. In the remainder of this section, we use the term “friend” for any network connection, whether vecino, amigo/compadre, or relativo.
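The construction of the undirected “OR”-network from survey nominations can be sketched as below. This is only an illustration of the rule stated above (two agents are linked if either names the other); the respondent names and nomination lists are invented, and the function names are assumptions rather than anything from the survey documentation.

```python
def or_network(nominations):
    """Return the set of undirected links: u and v are linked if either names the other."""
    links = set()
    for respondent, named in nominations.items():
        for other in named:
            if other != respondent:
                # frozenset makes (u, v) and (v, u) the same link
                links.add(frozenset((respondent, other)))
    return links

def average_degree(links, agents):
    """Average number of links per agent, counting agents with no links as zero."""
    degree = {a: 0 for a in agents}
    for link in links:
        for a in link:
            degree[a] = degree.get(a, 0) + 1
    return sum(degree.values()) / len(degree)

# Invented nominations (assumption): each respondent lists the people named.
nominations = {
    "ana": ["luis", "rosa"],
    "luis": ["ana"],        # reciprocated nomination still yields a single link
    "rosa": [],
    "juan": ["rosa"],       # one-sided nomination is enough for a link
}
links = or_network(nominations)
print(len(links), average_degree(links, nominations.keys()))
```

Using the OR rule rather than requiring mutual nominations keeps links that only one side reports, which matters when each respondent can list at most ten people.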
V.B. Empirical Framework
Measuring Capacities and Trust Flow. To adapt our model of social collateral to this empirical setting, we need to develop a measure of link capacity. We use the amount of time spent together as a proxy for the strength of a connection, capturing the intuition that link values depend on investment in joint social activity. In the data, the distribution of time spent together is skewed: the average responder spends less than six minutes with the bottom 10% of his/her friends and more than three hours with the top 10%. To obtain a more homogeneous measure, we define normalized time for two connected agents u and v as the value, for the amount of time they spend together, of the empirical cumulative distribution function of time spent together in their community. With this definition, the empirical distribution of normalized time τ(u, v) across all connected pairs is a discretized uniform distribution on the unit interval in each community. We assume that link capacities are created by an increasing production function g such that c(u, v) = g(τ(u, v)); that is, spending more time together results in stronger links. We compute the network flow between agents s and t by defining the circle of trust to be the subgraph that contains all links of s and t. This circle of trust allows a simple decomposition of the trust flow between s and t as
(6)  $T^{st}(c) = g(\tau(s,t)) + \sum_{v \in N_s \cap N_t} g(\min(\tau(s,v), \tau(v,t))),$
where the first term represents the direct flow and the second term is the indirect flow. Here Ns is the set of direct friends of agent s. Discrete Choice Framework. A natural approach to estimating the social collateral model is to use observations on how much agents borrow, and to use the loan size as a lower bound for the trust flow. This approach runs into the difficulty that loan amounts are also affected by demand: a borrower might borrow less than the trust flow. To avoid explicitly modeling loan demand, we instead base our estimation on who the agent borrows from, exploiting the idea that people are more likely to borrow from friends who trust them. By conditioning on the borrower, this approach effectively controls for loan demand as a fixed effect. We formulate the borrower’s choice of lender as a discrete choice problem. Consider agent s, who is in need of a loan of size
V, which he can borrow from potential lenders $t_1, \ldots, t_k$. We write the total utility that s enjoys when he borrows from a particular lender t as
(7)  $u_t = u(V, T^{st}(c) + \varepsilon_t),$
where u is increasing and $\varepsilon_t$ represents either measurement error in the trust between s and t, or a supply shock. Appendix II provides micro foundations for this representation by assuming that if V exceeds the level of trust $T^{st}$, the excess value must be secured using physical collateral that has some opportunity cost. Then, the borrower is more likely to turn to a lender who trusts him more, implying that
(8)  $\text{preferred lender} = \arg\max_t \, [T^{st}(c) + \varepsilon_t],$
because, conditional on the loan amount, (7) is maximized when trust is highest. Model Predictions. We use the above discrete choice specification to test three predictions of the social collateral model. (1) Agents are more likely to borrow from friends with whom they have a stronger trust flow. This prediction is a direct implication of Theorem 1. (2) The contribution of an indirect path of a given strength is equal to the contribution of a direct link with the same strength. This prediction is made because there are no costs to including intermediate agents within the circle of trust in the borrowing arrangement. In a setup where the circle of trust is endogenized, as in Section III.F, the contribution of indirect paths would be smaller, but still positive. (3) Each indirect path contributes to borrowing through its weakest link. In particular, in decomposition (6), for each indirect s → v → t path, if we have τ(s, v) < τ(v, t), then the contribution of the path to borrowing should depend only on τ(s, v). Some of these predictions are consistent with alternative explanations. Time spent together can be correlated with the strength of altruistic feelings between the two agents and the ease with which information travels between them. Common friends can further strengthen altruism and information transmission. Trust flow can therefore be a proxy for the lender’s altruism toward the borrower and the lender’s ability to learn about the profitability of the borrower’s project. There is no particular reason that in these alternative explanations the weakest link
[Figure VII: plot of excess propensity to borrow (vertical axis) against excess trust flow, measured as time flow (horizontal axis).]
FIGURE VII Trust Flow and Borrowing This figure is a residual plot, controlling for borrower fixed effects, of the relationship between trust flow, measured as time flow, and borrowing, where time flow is the sum of direct and indirect normalized time spent together. The figure is constructed as follows. For each borrower we calculate mean trust flow with all his or her friends, and define excess trust flow as the deviation from this mean. We similarly construct excess borrowing as the deviation from the average probability of borrowing across all friends. We sort all borrower–lender pairs by excess trust flow, group them into sixteen equal-sized bins, and plot the excess probability of borrowing (vertical axis) against the average excess trust flow (horizontal axis) for each bin.
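The construction described in this caption can be sketched in a few lines of Python. This is a minimal illustration on invented numbers, not the authors' code: the function name `binned_residuals`, the input layout (a list of `(borrower, time_flow, borrowed)` tuples), and the toy data are all assumptions. The caption uses sixteen bins; the toy example uses two.

```python
from collections import defaultdict


def binned_residuals(pairs, n_bins):
    """pairs: list of (borrower_id, time_flow, borrowed), with borrowed in {0, 1}.

    Returns one (mean excess flow, mean excess borrowing) point per bin.
    """
    by_borrower = defaultdict(list)
    for s, flow, borrowed in pairs:
        by_borrower[s].append((flow, borrowed))

    # Demean within borrower: this is the borrower fixed effect.
    residuals = []
    for obs in by_borrower.values():
        mean_flow = sum(f for f, _ in obs) / len(obs)
        mean_borrow = sum(b for _, b in obs) / len(obs)
        residuals.extend((f - mean_flow, b - mean_borrow) for f, b in obs)

    # Sort by excess flow and average within (near) equal-sized bins.
    residuals.sort(key=lambda r: r[0])
    n = len(residuals)
    points = []
    for k in range(n_bins):
        chunk = residuals[k * n // n_bins:(k + 1) * n // n_bins]
        if chunk:
            points.append((sum(x for x, _ in chunk) / len(chunk),
                           sum(y for _, y in chunk) / len(chunk)))
    return points


# Invented data: two borrowers, two friends each.
toy = [("a", 0.2, 0), ("a", 0.8, 1), ("b", 1.0, 0), ("b", 2.0, 1)]
bins = binned_residuals(toy, n_bins=2)
print([(round(x, 2), round(y, 2)) for x, y in bins])  # [(-0.4, -0.5), (0.4, 0.5)]
```

Because both variables are demeaned within borrower, differences across borrowers in overall loan demand or sociability drop out of the plotted relationship.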
should determine the strength of altruistic feelings or the strength of information transmission.29 However, without better data, we cannot completely exclude these alternative explanations. V.C. Results Graphical Analysis. We begin with a graphical analysis of trust flow and borrowing to highlight the basic patterns in the data. Assume that the strength of a link is proportional to normalized time: c (u, v ) = c · τ (u, v ). Then trust flow T st can be written as c · τ st , where τ st measures the total (direct plus indirect) “time flow” between agents s and t, computed using equation (6). Figure VII depicts the relationship between trust flow and borrowing in our sample, conditioned on borrower-specific fixed effects. The construction of the figure is the following. We introduce 29. One concrete model of altruism is where the lender cares about the utility of the intermediary who cares about the utility of the borrower. This model predicts that a geometric average of the two link values determines borrowing, which contradicts the weakest link condition of prediction 3. Similarly, if networks matter purely because they transmit information, then the average and not the minimum of link values should determine borrowing.
TABLE II
BORROWING AS A FUNCTION OF DIRECT AND INDIRECT FLOW

                      Direct time
Indirect time         Below average    Above average
Above average         21.0%            42.0%
Below average         14.5%            22.5%
Note. This table shows the role of indirect paths in borrowing. Direct and indirect trust flow are computed as direct and indirect normalized time flow for each borrower and lender pair (see the notes to Figure VI or the text for details.) The construction of the table is as follows. We compute mean direct and indirect flow for each borrower by averaging across his/her friends, and create two indicator variables for whether direct and indirect flow is above or below the average. The table shows how loans are distributed across the resulting four bins (direct flow below or above average × indirect flow below or above average).
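The direct and indirect flows referred to in this note follow the decomposition in equation (6). A minimal Python sketch, assuming the identity production function g(τ) = τ so that flow is measured in normalized time; the function name `time_flow`, the dictionary `tau` of normalized times keyed by unordered pairs, and the toy network are hypothetical.

```python
def time_flow(tau, s, t):
    """Return (direct, indirect) normalized time flow between s and t.

    Implements equation (6) with g equal to the identity:
    direct = tau(s, t); indirect = sum over common friends v of
    min(tau(s, v), tau(v, t)).
    """
    def link(u, v):
        return tau.get(frozenset((u, v)), 0.0)

    def friends(u):
        return {w for pair in tau if u in pair for w in pair if w != u}

    direct = link(s, t)
    common = friends(s) & friends(t)
    indirect = sum(min(link(s, v), link(v, t)) for v in common)
    return direct, indirect


# Invented network: s and t are linked directly and share one friend, v.
tau = {
    frozenset(("s", "t")): 0.5,
    frozenset(("s", "v")): 0.9,
    frozenset(("v", "t")): 0.4,
    frozenset(("s", "w")): 0.3,  # w is not a friend of t, so it adds nothing
}
direct, indirect = time_flow(tau, "s", "t")
print(direct, indirect)  # 0.5 0.4
```

The single indirect path contributes 0.4, the smaller of its two link values, which is the weakest-link property that prediction 3 tests.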
an indicator variable $I_{st}$, which is one if we observe s borrowing from t. For each borrower s we calculate the mean time $\bar{\tau}_s$ he or she spends with his or her friends, and the share $\bar{I}_s$ of friends he or she borrows from. We then define the borrower’s “excess time flow” with lender t as $\tau_{st} - \bar{\tau}_s$, and his or her “excess borrowing” from t by $I_{st} - \bar{I}_s$. Figure VII is simply a plot of excess borrowing against excess time flow, where observations are averaged over intervals of excess time flow to smooth out all uncorrelated noise. The figure shows a strong positive relationship, confirming the basic prediction that agents should be more likely to borrow from friends who trust them. Figure VII does not distinguish between direct and indirect flows. To get a sense of the relative contribution of indirect paths, in Table II we group all friends of each borrower into four categories along two dimensions: whether the direct flow between borrower and friend is below or above the average direct flow, and whether the indirect flow between borrower and friend is below or above the average indirect flow. We then calculate the share of loans that fall into each of the resulting four categories. About 14.5 percent of loans involve borrower/lender pairs with both below-average direct flow and below-average indirect flow. Almost twice as many loans involve borrower/lender pairs with either above-average direct or above-average indirect flow. About three times as many loans involve borrowers and lenders with both above-average direct and above-average indirect flow. Indirect paths appear to play an important role in creating social collateral for borrowing. Structural Estimation. To analyze the relationship between trust flow and borrowing in greater detail, we now estimate the
discrete choice model (8). This allows us to measure the relative strength of different network links, as well as to formally test our predictions. We allow capacities to depend on the time spent together in a flexible way, by classifying every link as weak, medium, or strong, depending on whether the time spent together lies in the lowest, medium, or highest third of the time distribution for each of the two communities. Each direct and indirect path between borrower and lender then makes a weak, medium, or strong contribution to total flow, where the strength of these different link types is measured by unknown parameters $c_W$, $c_M$, and $c_S$. Given our definition of the circle of trust, the trust flow $T^{st}(c)$ between s and t, as given by (6), is easily seen to be a linear function of $c = (c_W, c_M, c_S)$. Assuming that the error term ε has the extreme value distribution, we can then estimate (8) as a conditional logit,
(9)  $\Pr[\text{lender is } t] = \dfrac{\exp[(1/\lambda) \cdot T^{st}(c)]}{\sum_{u \in N_s} \exp[(1/\lambda) \cdot T^{su}(c)]},$
where λ > 0 measures the relative importance of the error term. Given the linearity of T st in c, the unobserved parameters λ and c cannot be separately identified, but we can use the estimates to back out capacity ratios like c S /c M . Table III reports our logit estimates. The first column contains our baseline specification; the coefficient estimates for total weak, medium, and strong flow correspond to cW /λ, c M /λ and c S /λ in the estimating equation. The effect of weak paths on borrowing is insignificant and small: gaining access to lenders through weak ties appears to be relatively less important for obtaining loans. Both medium and strong paths have a highly significant positive effect on borrowing, and the effect of strong paths is significantly greater. One additional medium path to a lender increases the probability of borrowing by a factor of 1.44, whereas an additional strong path increases the probability by a factor of 2.7. The ratio of the point estimates implies that the capacity of strong links is about three times as high as that of medium links: c S /c M ≈ 2.7. These results support prediction 1, that trust flow should be positively related to borrowing, and highlight the importance of strong ties. Is the contribution of an indirect path different from that of a direct path? To compare indirect and direct paths, in column (2) we add the number of indirect medium and strong paths as separate controls in the regression. According to our second prediction,
TABLE III
TRUST FLOW AND CHOICE OF LENDERS
CONDITIONAL LOGIT ESTIMATES

                              (1)          (2)          (3)          (4)
Total weak flow (cW/λ)        0.16         0.151        0.142        0.147
                              (0.143)      (0.142)      (0.164)      (0.164)
Total medium flow (cM/λ)      0.365        0.546        0.341        0.543
                              (0.155)∗     (0.266)∗     (0.19)       (0.267)∗
Total strong flow (cS/λ)      0.991        1.317        0.988        1.311
                              (0.163)∗∗    (0.283)∗∗    (0.165)∗∗    (0.284)∗∗
Indirect weak flow                         omitted                   omitted
Indirect medium flow                       −.190                     −.226
                                           (0.319)                   (0.379)
Indirect strong flow                       −.526                     −.516
                                           (0.363)                   (0.368)
Weak–not weak flow                                      0.073        0.018
                                                        (0.313)      (0.315)
Medium–strong flow                                      0.069        0.06
                                                        (0.27)       (0.34)
Geographic distance           −.006        −.006        −.006        −.006
                              (0.003)∗     (0.003)∗     (0.003)∗     (0.003)∗
Obs.                          988          988          988          988
Note. Each link is classified as weak, medium, or strong depending on whether the time spent together lies in the lowest third, medium third, or highest third of the time distribution. Weak, medium, and strong total flow are defined by noting that each direct and indirect path between borrower and lender makes either a weak, medium, or strong contribution to total flow. For indirect medium and strong flow we only count indirect paths. Weak–not weak flow counts paths where exactly one link is weak; medium–strong flow counts paths where one link is medium and one link is strong. We do not include indirect weak flow in columns (2) and (4) because we cannot separately identify total and indirect weak flow in our conditional logit estimation, as every potential lender has at least a weak link to the borrower. ∗ 5% significance level. ∗∗ 1% significance level.
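To make the magnitudes concrete, the conditional logit in equation (9) can be evaluated at the column (1) point estimates. The sketch below is illustrative only: the function `choice_probabilities` and the three hypothetical lenders with their path counts are assumptions, while the coefficients 0.16, 0.365, and 0.991 are the column (1) estimates reported above. Geographic distance is ignored for simplicity.

```python
import math

# Column (1) point estimates of c_W/lambda, c_M/lambda, c_S/lambda.
COEF = {"weak": 0.16, "medium": 0.365, "strong": 0.991}


def choice_probabilities(paths_by_lender):
    """Conditional logit probabilities over one borrower's potential lenders.

    paths_by_lender: dict lender -> {"weak": n, "medium": n, "strong": n},
    the number of paths of each strength between borrower and lender.
    """
    index = {t: sum(COEF[k] * n for k, n in paths.items())
             for t, paths in paths_by_lender.items()}
    top = max(index.values())  # subtract the maximum for numerical stability
    weights = {t: math.exp(x - top) for t, x in index.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


# Hypothetical borrower with three potential lenders.
lenders = {
    "t1": {"weak": 1},
    "t2": {"weak": 1, "medium": 1},
    "t3": {"weak": 1, "strong": 1},
}
probs = choice_probabilities(lenders)
print({t: round(p, 3) for t, p in probs.items()})

# Multiplicative effect of one extra medium or strong path on the choice odds.
print(round(math.exp(COEF["medium"]), 2), round(math.exp(COEF["strong"]), 2))  # 1.44 2.69
```

In this sketch the ratio of choice probabilities between two lenders who differ by one medium path is exp(0.365), about 1.44, and by one strong path exp(0.991), about 2.7, which are the factors quoted in the text; the ratio 0.991/0.365 is likewise about 2.7.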
the coefficients of these variables should be zero. We find that the estimated coefficients on indirect flow are negative, but not statistically significant, and smaller than the corresponding coefficients on total flow. These results show that both direct and indirect paths have a substantial positive effect on borrowing, confirming the basic intuition that dense networks are better in creating social collateral. The negative estimates of indirect flows, although insignificant, suggest that the effect of indirect paths is slightly smaller, which can be explained in our model by endogenizing the circle of trust as in Section III.F. Combined with the results about strong ties, these estimates suggest that dense networks and bonding social capital are important for obtaining loans in these communities. We now test the prediction about the role of the weakest link in column (3), where we include two new explanatory variables
in the regression. “Weak-not weak flow” counts the number of indirect paths where one link is weak and the other is medium or strong, whereas “medium-strong flow” counts the number of paths where one link is medium and the other is strong. If prediction 3 is false, then these paths should have a positive effect on borrowing beyond what is predicted by the social collateral “weakest link” theory. The estimated coefficients on these variables are insignificant and small, providing strong evidence for the role of the weakest link in determining social collateral. These results are replicated in column (4), which includes the controls for indirect flows. Our findings about the role of indirect paths and the weakest link property help distinguish our model from other explanations for borrowing, such as altruism and information transmission. One caveat with our econometric analysis is that if time spent together increases due to borrowing, reverse causality confounds the interpretation of the estimates. Thus, the evidence supports, albeit not exclusively, the social collateral model; moreover, strong ties and network closure, that is, bonding social capital, appear to be particularly important for borrowing. Importantly, the theoretical framework provides clear predictions that can be tested in further settings, with perhaps more control over key empirical identification issues. VI. CONCLUSIONS This paper has built a model where agents use their social connections as collateral to secure informal loans. This model naturally leads to a definition of network-based trust, which we then use in applications related to network structure and welfare, trust in job search, and the measurement of social capital. We conclude by sketching three other applications of the social collateral model. VI.A. Network Statistics When informal arrangements are restricted by the circle of trust to connections within a given social distance, our model generates a family of trust measures. 
Our working paper, Mobius and Szeidl (2007), shows that when all links have equal capacity, these measures are functions of several commonly used network statistics, including (1) number of friends; (2) the clustering coefficient, which is a measure of local network density; (3) the number of common friends of two agents; and (4) the number of transitive triples,
another measure of network density.30 These results provide social collateral-based foundations for common network statistics.
VI.B. Risk Sharing
Development economists often emphasize the importance of informal insurance in developing countries. Ambrus, Mobius, and Szeidl (2008) use the social collateral model to explore risk sharing in networks. They find that good risk sharing requires networks to be expansive: larger sets of agents should have more connections with the rest of the community. Networks shaped by geographic proximity have this property, because agents tend to have friends at a close distance in multiple directions, helping to explain the observed good risk sharing in village environments. They also find that network-based insurance is local: socially closer agents insure each other more.
VI.C. Dynamics of Trust and Panics
In the basic social collateral model, link capacities are exogenous. Mobius and Szeidl (2008) show that link values can be endogenized with multiple rounds of exchange. The strength of a relationship is, then, the sum of its direct value, as in the basic model, plus the indirect value, which derives from the ability to conduct transactions through the link in the future. In this framework, fluctuations can be amplified through a network multiplier similar to the social multiplier of Glaeser, Sacerdote, and Scheinkman (2003), because trust withdrawal that constrains exchange locally can lead to further trust withdrawals that ripple through the network. New technologies that limit future social interaction, such as television, can substantially reduce trust and social capital through this mechanism.31
APPENDIX I: PROOFS
DEFINITION 6. A weak flow with origin s is a function g : W × W → R with the following properties:
(i) Skew symmetry: g(u, v) = −g(v, u).
(ii) Capacity constraint: g(u, v) ≤ c(u, v).
(iii) Weak flow conservation: $\sum_w g(u, w) \le 0$ unless u = s.
30. These measures are used, for example, in Wasserman and Faust (1994), Watts and Strogatz (1998), Glaeser et al. (2000), and Jackson (2006).
31. In related work, Kranton (1996) and Spagnolo (1999) study the interaction between social and business activities.
A weak flow of origin s can be thought of as taking a certain amount from node s and carrying it to various other nodes in the network. By weak flow conservation, any node other than s receives a nonnegative amount.
LEMMA 1. We can decompose any weak flow g as
$g = \sum_{u \in V} f_u,$
where for each u, $f_u$ is an s → u flow, that is, $\sum_w f_u(v, w) = 0$ for all v ≠ u, v ≠ s, and moreover $\sum_w f_u(u, w) = \sum_w g(u, w)$, that is, $f_u$ delivers the same amount to u that g does.
Proof. Consider vertex u such that $\sum_w g(u, w) < 0$. By weak flow conservation, the amount of the flow that is left at u must be coming from s. Hence, there must be a flow $f_u \le g$ carrying this amount from s. With $f_u$ defined in such a way, repeat the same procedure for the weak flow $g - f_u$ with some other vertex u′. After $f_u$ is defined for all vertices u, the remainder f satisfies flow conservation everywhere and can be added to any of the flows.
Implicit summation notation: For a weak flow g and two vertex sets U ⊆ W and V ⊆ W, we use the notation that
$f(U, V) = \sum_{u \in U,\, v \in V} f(u, v).$
Proof of Theorem 1. Sufficiency. We begin by showing that when (4) holds, a side deal–proof equilibrium exists. By assumption, there exists an s → t flow with value V . For all u and v, let h (u, v ) equal the value assigned by this flow to the (u, v ) link. Now consider the strategy profile where (1) the borrowing arrangement h is proposed and accepted, (2) the borrower returns the asset, and (3) all transfers are paid if the borrower fails to return the asset. This strategy is clearly an equilibrium. To verify that it is side deal–proof, consider any side deal, and let S denote the set of agents involved. For s to be strictly better off, it must be that he prefers not returning the asset in the side deal. Now consider the ( S, T ) cut. By definition, the amount that flows through this cut under the original arrangement is V ; but then the same amount must flow through the cut in the side deal, as well. This means that s must transfer at least V in the side deal; but then he cannot be better off. More generally, this argument shows that any transfer arrangement that satisfies flow conservation is side deal–proof.
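The flow and cut objects used in this argument can be computed directly. The following Python sketch is an illustration under stated assumptions rather than part of the proof: it implements the standard Edmonds-Karp (breadth-first augmenting path) algorithm on a small invented network with symmetric link capacities, returns the maximum s → t flow together with the source side of a minimum cut, and checks whether an s → t flow of a given value V exists. The names `symmetric`, `max_flow`, and the example capacities are hypothetical.

```python
from collections import deque


def symmetric(links):
    """Turn undirected link capacities {(u, v): c} into directed arcs both ways."""
    cap = {}
    for (u, v), c in links.items():
        cap[(u, v)] = c
        cap[(v, u)] = c
    return cap


def max_flow(cap, s, t):
    """Edmonds-Karp maximum flow.

    cap: dict {(u, v): capacity} of directed arcs.
    Returns (flow value, set of nodes on the source side of a minimum cut).
    """
    residual = dict(cap)
    nodes = set()
    for (u, v) in cap:
        nodes.update((u, v))
        residual.setdefault((v, u), 0.0)
    adj = {u: [] for u in nodes}
    for (u, v) in residual:
        adj[u].append(v)

    value = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # Nodes still reachable from s form the source side of a minimum cut.
            return value, set(parent)
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= push
            residual[(v, u)] += push
        value += push


# Invented network: s is well connected to a and b, but their links to t are weak.
links = {("s", "a"): 4.0, ("s", "b"): 4.0, ("a", "t"): 1.0,
         ("b", "t"): 2.0, ("a", "b"): 1.0}
value, source_side = max_flow(symmetric(links), "s", "t")
print(value, sorted(source_side))  # 3.0 ['a', 'b', 's']
print(2.5 <= value, 3.5 <= value)  # True False
```

In this example the binding cut separates {s, a, b} from t and has value 1 + 2 = 3, equal to the maximum flow, so a flow of value 2.5 exists while one of value 3.5 does not.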
Necessity. We now show that when (4) is violated, no side deal–proof equilibrium exists. We proceed by assuming to the contrary that a pure strategy side deal–proof equilibrium implements borrowing even though (4) fails. First note that on the equilibrium path, the borrower must weakly prefer not to default. To see why, suppose that the borrower chooses to default on the equilibrium path. Because the lender and all intermediate agents must at least break even, this implies that the borrower has to make a transfer payment of at least V. But then the borrower must weakly prefer not to default, because returning the asset directly has a cost of V. This also implies that all intermediate agents must have a zero payoff. By assumption, there exists an (S, T) cut with value c(S, T) < V. We now construct a side deal where all intermediate agents in S continue to get zero, but the payoff of s strictly increases. The idea is easiest to understand in an equilibrium where promises are kept, that is, when all transfers satisfy the capacity constraint h(u, v) ≤ c(u, v). Then, we simply construct an arrangement that satisfies flow conservation inside S and delivers to the “boundary” of S the exact amount that was promised to be carried over to T under h. More generally, when the capacity constraints fail over some links, the deviation in the side deal can result in some agents in S losing friendships with agents outside S. To compensate for this loss, the side deal must deliver to the “boundary” of S an additional amount that equals the lost friendship value. Formally, let g be a maximal s → t flow and consider the restriction of g to S. This is a weak flow, and by the lemma it can be decomposed as $g = \sum_{u \in S} g_u$, where each $g_u$ is an s → u flow. Now for each u ∈ S, let g(u, T) and h(u, T) denote the amounts leaving S through u under g and h.
Moreover, for each u ∈ S, let z(u, T) denote the total friendship value lost to u in the subgame where the borrower defaults, as a consequence of unkept transfer promises. Because g is a maximum flow and (S, T) is a minimal cut, it follows that g(u, T) ≥ h(u, T) + z(u, T). This is because any link between u and T is either represented in h(u, T), if u pays the transfer, or z(u, T), if u does not pay and loses the friendship. This inequality implies that, whenever h(u, T) + z(u, T) > 0, we also have g(u, T) > 0. As a result, we can define
$h' = \sum_{u \in S} \dfrac{h(u, T) + z(u, T)}{g(u, T)} \cdot g_u.$
Note that h′ is a weak flow in S and delivers exactly h(u, T) + z(u, T) to all agents in S. Thus h′ satisfies flow conservation within S and delivers to the “boundary” of S the sum of two terms: h(u, T), which is the precise amount to be carried over to T under h, and z(u, T), which is the loss of friendship u suffers due to not making other promised transfers. We claim that h′ is a profitable side deal. First, h′ satisfies all capacity constraints by construction. Second, all agents in S break even under h′, as they did in the original equilibrium. Third, the total value delivered by h′ is at most c(S, T) < V, which means that s pays less than V under h′, whereas he pays exactly V in the original equilibrium. We have constructed a side deal in which the borrower is better off and all other players are best-responding; hence, the original equilibrium was not side deal–proof.
Proof for Section III.F. Transfer Constraints. In this analysis, we use a more stringent equilibrium selection criterion: We look for equilibria where (i) all promised transfers are paid; and (ii) there are no profitable side deals. In the earlier analysis, there was no need to impose (i), because the characterization results showed that any level of borrowing that can be implemented can also be implemented using equilibria where all transfers are paid. With transfer constraints, requiring that all promises be credible has additional bite, because promises that are not credible can generate large punishment in the form of loss of friendship to agents who have small $k_u$. We find it plausible that such agents will not make promises that they know they cannot keep, but instead of providing formal micro foundations for this, we simply restrict ourselves to equilibria that are “credible,” in the sense that all promises are kept. Consider the directed network G′ defined in the text and let the maximum $s_1 \to t_1$ flow in G′ be denoted by $T^{s_1 t_1}(c)$.
PROPOSITION 4. 
There exists a side deal–proof equilibrium with credible promises that implements borrowing if and only if
(10)  $V \le T^{s_1 t_1}(c).$
Proof. Sufficiency. If (10) holds, then take a flow with value V , and let the flow values between different agents define the transfer arrangement in our candidate equilibrium. Note that by construction, this borrowing arrangement satisfies the borrowing
constraints of all agents u. Moreover, the promised transfers in this arrangement will be kept because they all satisfy the capacity constraint. It remains to be shown that there are no profitable side deals; this follows from the same argument used in the proof of Theorem 1. Necessity. Suppose that (10) fails, and consider an equilibrium where promised transfers are paid and borrowing is implemented. We now show that this equilibrium admits a side deal. Our argument is similar to the proof of Theorem 1, in that we build the side deal using a minimum cut on the network G′. However, the present setup has one additional difficulty: we need to make sure that the side deal emerging from the minimum cut does not separate agents from their duplicates. Let (S′, T′) be a minimum cut. If for some u ≠ s we have $u_2 \in S'$, then $u_1 \in S'$ also holds, because $u_2$ has only one incoming link, which originates in $u_1$. Let S be the union of s and the collection of agents u such that $u_1 \in S'$. We need to show that agents in S, as a group, do not have the right incentive to return the asset. To see why, consider first an agent u ∈ S such that $u_2 \notin S'$. It follows that the (S′, T′) cut separated $u_1$ from $u_2$, by cutting the $u_1 \to u_2$ link. But in this equilibrium, promises are kept, and, hence, the total obligation of u to agents outside S can be at most $k_u$, which is exactly the value of the cut link. Next consider an agent u ∈ S such that $u_2 \in S'$. For this agent, the total obligations to others outside S are bounded from above by the total value of the links originating in $u_2$ that are cut. Summing over all u ∈ S, we conclude that the total obligations of all agents in S do not exceed the value of the (S′, T′) cut, and, hence, are strictly smaller than V. Thus, S, as a group, has an incentive to default. The actual side deal can now be constructed in the same way as in the proof of Theorem 1.
Proof of Proposition 1. Consider two capacities $c_1 \le c_2$. Any flow between s and t that is feasible under $c_1$ is also feasible under $c_2$; hence the maximum flow cannot be lower under $c_2$ than under $c_1$.
Proof of Proposition 2. We denote the share of total paths to agents with whom agent s has precisely j paths with $q_s(j)$. If we treat this function as a probability density function over the nonnegative integers, then an increase in closure is equivalent to a first-order stochastic dominance shift.
The expected payoff of s, conditional on his being the borrower, can be written as 1 qs ( j) ( j) 1 qs ( j) ( j) = , N j j N j j which can be viewed as the expected value of the function ( j)/j under the probability density qs ( j). In a high-value exchange environment, (V ) is convex because (V ) = ω(V ) is increasing; this, combined with the fact that (0) = 0, implies that (V )/V is nondecreasing. In this case, a first-order stochastic dominance increase in the probability density qs ( j ) increases the expected payoff by definition. An analogous argument shows that in a low-value exchange environment, the same increase in the sense of first-order stochastic dominance reduces the expected payoff of s.
Proof of Proposition 3. Preliminaries. The timeline of the model with job search is the following. In stage 1, a set of agents, including s1 , . . . , sk and t, agree on a transfer arrangement that specifies transfers h (u, v ) to be made in the event that s1 , . . . , sk send recommendations, and s is hired and then turns out to be a low type. In stage 2, agents s1 , . . . , sk choose whether to recommend s to the employer t. In stage 3, t decides whether to hire s or not; profits are earned, and the type of s is publicly revealed. In stage 4, if needed, the transfer arrangement is executed; and in stage 5, agents consume the values of remaining links. We consider a class of coalitional deviations that we call side deals with bribes. A side deal with bribes is a new transfer arrangement proposed by s to s1 , . . . , sk and potentially some other agents at the beginning of stage 2, together with a set of bribes b1 , . . . , bk that s pays to s1 , . . . , sk in exchange for their recommendation. For simplicity, we assume that bribes are spot transactions: each agent s j sends the recommendation at the same time that he receives the bribe. We assume that when the surpluses from hiring through the network and in the market are the same, t always hires in the market. 
Proof. Fix a pure strategy equilibrium robust to side deals with bribes. If a low type is hired in this equilibrium, then the expected surplus from the employment relationship is S, which is the same as hiring in the formal market, and hence t never hires through the network. It follows that in equilibrium only high types
are hired in the network. Now suppose that in this equilibrium $T^{st}(c) < \alpha \cdot (S_H - S)$ and the high-type worker is hired. Then the low type can propose a profitable side deal with bribes. As in the proof of the main theorem, this side deal includes all agents in a minimum cut separating s from t in G∞ and transmits an amount equal to the maximum flow to agents at the boundary of the cut. The bribes in the side deal are specified to equal the amounts that flow through agents $s_1, \ldots, s_k$ in this flow. It follows that all agents weakly prefer accepting the side deal: intermediate agents at least break even by flow conservation, and the friends of s all break even because the bribes exactly compensate them for the payments to be made in the side deal. This contradiction shows that in any side deal–proof equilibrium where the high type is hired, we must have $T^{st}(c) \le \alpha \cdot (S_H - S)$. Finally, if this inequality holds, then the transfer arrangement specified by the maximum flow in G∞ is easily seen to be an equilibrium robust to side deals with bribes.
APPENDIX II: MICRO FOUNDATIONS FOR SOCIAL SANCTIONS
In this Appendix, we develop a model where punishment at the level of the link arises endogenously. There are three key changes relative to the model presented in the main text: (1) with probability p > 0, the asset disappears, for example, is stolen by a third party, after the borrower uses it. (2) Each link “goes bad” with a small probability ε during the model, capturing the idea that friendships can disappear for exogenous reasons. (3) The utility of friendship is modeled using a “friendship game” where agents can choose to interact or stay away from each other. The payoffs of this friendship game depend on the capacity of the link and on whether the link has gone bad.
A. Model Setup
This model consists of the following six stages: Stage 1: Realization of Needs. Identical to stage 1 in Section III. Stage 2: Borrowing Arrangement. 
In this model, there is uncertainty about whether the asset disappears after being used. As a result, the arrangement is now a set of state-contingent payments, where the publicly observable state of the world i is either i = 0, if the asset is returned, or i = 1, if the asset is reported stolen. A borrowing agreement consists of two parts. (1) A contract specifying payments yi to be made by the borrower to the
lender in the two states (i = 0 or 1). This contract can be thought of as a traditional incentive contract to solve the moral hazard problem in lending. If there were a perfect court system in the economy, then this contract would be sufficient to achieve efficient lending. (2) A transfer arrangement specifying payments hi (u, v ) to be made between agents in the social network if the borrower fails to make the payment yi . Here hi (u, v ) denotes a payment to be made by u to v in state i.32 Stage 3: Repayment. If an arrangement was reached in stage 2, the asset is borrowed and s earns an income of ω (V ), where ω(.) is a differentiable, nondecreasing function. Following the use of the asset, with probability p it is stolen. We assume that ω (V ) > pV for all V in the support of F, which guarantees that lending the asset is the socially efficient allocation. Even if the asset is not stolen, the borrower may choose to pretend that it is stolen and sell it at the liquidation value of φ · V , where φ < 1. The borrower then chooses whether to make the payment yi specified in the contract. Stage 4: Bad Links. At this stage, any link in the network may go bad with some small probability. We think of a bad link as the realization by a player that he no longer requires the business or friendship services of his friend. As we describe below, cooperation over bad links in the friendship game is no longer beneficial. Therefore, agents who learn that a link has gone bad will find it optimal not to make a promised transfer along the link. From a technical perspective, bad links are a tool to generate cooperation without repeated play, just like the “Machiavellian types” in Dixit (2003) (see also Benoit and Krishna [1985]). In an equilibrium where promised transfers are expected to be paid, failure by u to make a payment will be interpreted by v as evidence that the link has gone bad. 
In this case, v will defect in the friendship phase, which reduces the payoff of the deviator u by c(u, v). To formalize bad links, assume that for every link of every agent, with a small probability ε > 0 independent across agents and links, the player learns that his link has gone bad at this stage. Thus, for any link (u, v), the probability that the link has not gone bad is $(1 - \varepsilon)^2$; and for any link (u, v) where u does not learn that the link has gone bad, u still believes, correctly, that with probability ε the link has gone bad.

32. The circle of trust may restrict the links over which arrangements may be proposed. This case can be treated in the proof by assuming that G denotes the subgraph of permissible links.
Stage 5: Transfer Payments. If the borrower chose to make payment $y_i$ in stage 3, then this stage of the game is skipped, and play moves on to the friendship phase. If the borrower did not make payment $y_i$, then at this stage agents in the social network choose whether to make the prescribed transfers $h_i(u, v)$. Each agent has a binary choice: either he makes the promised payment in full or he pays nothing.

Stage 6: Friendship Game. Each link between two agents u and v has a friendship game with an associated value c(u, v). As long as the link is good, the friendship game is a two-player coordination game with two actions, with payoffs (row player's payoff listed first)

            C                      D
    C   c(u, v), c(u, v)       0, c(u, v)/2
    D   c(u, v)/2, 0           −1, −1
This game has a unique equilibrium (C,C) with payoff c(u, v) to both parties, which represents the benefit from friendly interactions. A party only derives positive benefits if his or her friend chooses to cooperate; and benefits are highest when there is mutual cooperation. If a link has gone bad, cooperation is no longer beneficial, and the payoffs of the friendship game change:

            C           D
    C   −1, −1       0, 0
    D    0, 0        0, 0
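The equilibrium claims for the two friendship games can be checked mechanically. The following is a minimal sketch, assuming the payoff assignment described in the text: c(u, v) to each party under mutual cooperation, c(u, v)/2 to a defector whose friend cooperates, 0 to a cooperator whose friend defects, and −1 to each under mutual defection on a good link; on a bad link, −1 to each under mutual cooperation and 0 otherwise. The function names `pure_nash`, `good_link_game`, and `bad_link_game` are illustrative and do not come from the paper.

```python
from itertools import product

ACTIONS = ("C", "D")

def pure_nash(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs maps (row_action, col_action) -> (row_payoff, col_payoff).
    """
    equilibria = []
    for a, b in product(ACTIONS, repeat=2):
        # The row player must not gain by deviating while the column action is fixed.
        row_ok = all(payoffs[(a, b)][0] >= payoffs[(a2, b)][0] for a2 in ACTIONS)
        # The column player must not gain by deviating while the row action is fixed.
        col_ok = all(payoffs[(a, b)][1] >= payoffs[(a, b2)][1] for b2 in ACTIONS)
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria

def good_link_game(c):
    """Friendship game on a good link with value c > 0 (assumed payoff layout)."""
    return {
        ("C", "C"): (c, c),
        ("C", "D"): (0.0, c / 2),
        ("D", "C"): (c / 2, 0.0),
        ("D", "D"): (-1.0, -1.0),
    }

def bad_link_game():
    """Friendship game after the link has gone bad."""
    return {
        ("C", "C"): (-1.0, -1.0),
        ("C", "D"): (0.0, 0.0),
        ("D", "C"): (0.0, 0.0),
        ("D", "D"): (0.0, 0.0),
    }

if __name__ == "__main__":
    print("good link:", pure_nash(good_link_game(2.0)))
    print("bad link: ", pure_nash(bad_link_game()))
```

Under these assumed payoffs, C is a strictly dominant action on a good link, so (C,C) is the only equilibrium; on a bad link, D is weakly dominant and mutual cooperation is never an equilibrium.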
In the bad-link game, mutual cooperation leads to the low payoff of −1, capturing the idea that parties who are no longer friends might find it unpleasant to interact. If either party defects, the payoff of both parties is set to zero. The payoffs in the friendship game imply that if a player knows that a link has gone bad with probability 1, a best response is to play D.

B. Model Analysis

Because there is uncertainty in this model, we need to extend the concept of side deals to Bayesian games.

DEFINITION 7. Consider a pure strategy profile σ and a set of beliefs μ. A side deal with respect to (σ, μ) is a set of agents S, a transfer arrangement $h_i(u, v)$ for all u, v ∈ S, and a set of continuation strategies and beliefs $\{(\tilde\sigma_u, \tilde\mu_u) \mid u \in S\}$ proposed by s to agents at the end of stage 2, such that
(i) $U_u(\tilde\sigma_u, \tilde\sigma_{S-u}, \sigma_{-S} \mid \tilde\mu_u) \ge U_u(\sigma'_u, \tilde\sigma_{S-u}, \sigma_{-S} \mid \tilde\mu_u)$ for all $\sigma'_u$ and all u ∈ S,
(ii) the beliefs $\tilde\mu$ satisfy Bayes’ rule whenever possible if play is determined by $(\tilde\sigma_S, \sigma_{-S})$,
(iii) $U_u(\tilde\sigma_S, \sigma_{-S} \mid \tilde\mu_u) \ge U_u(\sigma_S, \sigma_{-S} \mid \mu)$ for all u ∈ S,
(iv) $U_s(\tilde\sigma_S, \sigma_{-S} \mid \tilde\mu_u) > U_s(\sigma_S, \sigma_{-S} \mid \mu)$.

The only conceptually new condition is (ii), which is clearly needed in a Bayesian environment. Motivated by this definition, our equilibrium concept will be a side deal–proof perfect Bayesian equilibrium.

THEOREM 2. There exists a side deal–proof perfect Bayesian equilibrium that implements borrowing between s and t if and only if the asset value V satisfies

$$V \le T^{st}(c) \cdot \frac{(1 - \varepsilon)^2}{\phi + p(1 - \phi)}. \tag{11}$$
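To make condition (11) concrete, the following sketch computes the maximum s → t flow on a small network and the implied bound on the asset value. It rests on several assumptions that are not in the text: each link is treated as usable in either direction up to its capacity c(u, v), the maximum flow is computed with the Edmonds–Karp algorithm (one standard choice, not necessarily the authors'), and the three-agent network and parameter values are made up for illustration. The names `max_flow` and `borrowing_limit` are hypothetical.

```python
from collections import defaultdict, deque

def max_flow(capacity, s, t):
    """Edmonds-Karp maximum flow; capacity maps undirected links (u, v) -> c(u, v)."""
    residual = defaultdict(lambda: defaultdict(float))
    for (u, v), c in capacity.items():
        residual[u][v] += c
        residual[v][u] += c  # assumption: a link can carry flow in either direction
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path from s to t.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path.
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck

def borrowing_limit(trust_flow, eps, phi, p):
    """Largest asset value V allowed by condition (11)."""
    return trust_flow * (1 - eps) ** 2 / (phi + p * (1 - phi))

if __name__ == "__main__":
    # Hypothetical network: s and t are direct friends and share one common friend a.
    links = {("s", "a"): 3.0, ("a", "t"): 2.0, ("s", "t"): 1.0}
    T = max_flow(links, "s", "t")
    print("trust flow:", T)
    print("largest V:", borrowing_limit(T, eps=0.05, phi=0.5, p=0.1))
```

In this example the indirect path through a contributes only min(3, 2) = 2 to the trust flow, which illustrates why the weakest link on each path determines how much borrowing the network can secure.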
Proof. We begin by analyzing the optimal incentive contract in the absence of enforcement constraints. Suppose that s makes payments $x_i$ (i = 0 or i = 1) in the two states of the world. What values of $x_i$ guarantee that s chooses to return the asset and t breaks even? To prevent s from stealing, the excess payment if the asset is reported stolen must exceed the liquidation value φV:

$$x_1 - x_0 \ge \phi V. \tag{12}$$

For the lender to break even, he has to receive at least pV in expectation:

$$p x_1 + (1 - p) x_0 \ge pV. \tag{13}$$

The minimum transfers that satisfy (12) and (13) are

$$x_0 = p(1 - \phi)V \quad \text{and} \quad x_1 = [\phi + p(1 - \phi)]V. \tag{14}$$
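As a quick numerical check of (14), the sketch below computes the minimum payments for given V, p, and φ and confirms that both the incentive constraint (12) and the break-even constraint (13) hold with equality. The function name `minimum_payments` and the parameter values are illustrative assumptions, not taken from the paper.

```python
def minimum_payments(V, p, phi):
    """Minimum state-contingent payments (x0, x1) from equation (14)."""
    x0 = p * (1 - phi) * V
    x1 = (phi + p * (1 - phi)) * V
    return x0, x1

if __name__ == "__main__":
    V, p, phi = 100.0, 0.1, 0.5  # hypothetical asset value, theft probability, liquidation share
    x0, x1 = minimum_payments(V, p, phi)
    # Incentive constraint (12): the extra payment when theft is reported equals phi * V.
    print("x1 - x0 =", x1 - x0, "vs phi * V =", phi * V)
    # Break-even constraint (13): the expected payment equals the expected loss p * V.
    print("p*x1 + (1-p)*x0 =", p * x1 + (1 - p) * x0, "vs p * V =", p * V)
```

Because both constraints bind, lowering either payment would violate (12) or (13), which is why these are the minimum transfers.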
When the enforcement constraints are brought back, it is intuitive that borrowing can be implemented in the network as long as $\max[x_0, x_1]$ does not exceed the maximum flow between s and t: in that case, the borrower can just transfer $x_i$ to the lender along the network. Because $x_1 > x_0$, this requires that $x_1$ not exceed the maximum flow, or equivalently

$$V \le c(s, t) \cdot \frac{(1 - \varepsilon)^2}{\phi + p(1 - \phi)},$$
which is indeed the condition in the theorem. We now turn to the proof.

Sufficiency. We begin by showing that when (11) holds, a side deal–proof equilibrium exists. Let $x_i$ be defined by (14) and let $y_i = x_i$. By assumption, there exists a flow with respect to the capacity c that carries $x_1/(1 - \varepsilon)^2$ from s to t. For all u and v, define $h_1(u, v)$ to be $1 - \varepsilon$ times the value assigned by this flow to the (u, v) link. Similarly, let $h_0(u, v)$ be equal to $1 - \varepsilon$ times a flow that carries $x_0/(1 - \varepsilon)^2$ from s to t. Now consider the strategy profile in which (1) the transfer arrangement $(x_i, h_i)$ is proposed and accepted, (2) the asset is borrowed and returned unless stolen, (3) every agent u pays every promised transfer $h_i(u, v)$ if necessary, unless he learns that his link with v has gone bad, and (4) all agents play C in the friendship game unless they learn that the link has gone bad, in which case they play D. This strategy profile σ generates beliefs μ, and (σ, μ) constitute a perfect Bayesian equilibrium. To see why, note that conditional on others making the transfer payments, it is optimal for s to make the payments $y_i$ and not to steal the asset. Also, because $h_i(u, v) \le (1 - \varepsilon)\, c(u, v)$, all agents find it optimal to make the transfer payments given beliefs. Finally, because on-path play never gets to the transfers, all intermediate agents are indifferent between accepting the deal and rejecting it. In fact, even if the transfers were used in one or both states on path, intermediate agents would still break even, because the $h_i$ are defined using flows.

We also need to verify that the equilibrium proposed here is side deal–proof. Consider any side deal, and let S denote the set of agents involved. Suppose that after the side deal, the borrower reports that the asset is stolen with probability $p' \ge p$. Let T be the complement of S in W, and consider the (S, T) cut. By definition, the expected amount that flows through the (S, T) cut in state i if $y_i$ is not paid equals $x_i$. If the borrower never chooses to pay $y_i$ in the side deal, he will have to make sure that at least $p' x_1 + (1 - p') x_0$ gets to the cut in expectation. Because all intermediate agents must break even in expectation, this implies that s’s expected payments must be $p' x_1 + (1 - p') x_0$ or more. Thus the side deal comes with a cost increase of $(p' - p)[x_1 - x_0]$. The increase in expected cost is easily seen to be the same if the borrower chooses to pay $y_i$ in one or both states. The expected benefit of the side deal is $(p' - p)\phi V$. By equation (12) the expected benefit does not exceed the expected cost; the side deal is not profitable to s, which is a contradiction. Hence the original arrangement was side deal proof.
Necessity. We now show that when (11) is violated, no side deal–proof equilibrium exists. We proceed by assuming to the contrary that a pure strategy side deal–proof perfect Bayesian equilibrium implements borrowing even though (11) fails. For simplicity, we assume that the equilibrium proposed transfers $h_i(u, v)$ are expected to be paid by all agents u in stage 5 if the borrower chooses not to pay $y_i$ directly; that is, we only focus on equilibria where promises are kept. This condition is not necessary to obtain the result, but simplifies the proof somewhat. If this condition holds, then $h_i(u, v) \le (1 - \varepsilon)\, c(u, v)$ holds for all transfers proposed in equilibrium, because the amount by which u can expect to benefit from his friendship with v is at most $(1 - \varepsilon)\, c(u, v)$. Let $\chi_i = 1$ if in state i on the equilibrium path, s chooses not to pay $y_i$, and let $\chi_i = 0$ otherwise.

Case I. $\chi_0 = \chi_1 = 1$. In this case, on the equilibrium path, the $y_i$ are never paid, and instead the transfer arrangements are always used. Define the expected transfer $h = p h_1 + (1 - p) h_0$. By the individual rationality of intermediate agents, h satisfies weak flow conservation, and therefore by the lemma can be decomposed as

$$h = \sum_{u \in V,\, u \ne t} f_u + h',$$

where $f_u$ is an s → u flow and $h' = f_t$. In words, the $f_u$ flows deliver the expected profits to the intermediate agents, whereas $h'$ is an s → t flow that delivers the expected payoff to the lender. Denote $\sum_{u \ne t} f_u = f$; then f is a weak flow delivering the payments to all intermediate agents.

Our proof strategy will be the following. First, we take out the profits of all intermediate agents from the capacity c and the transfer h, essentially creating a “reduced” problem where intermediate agents are expected to break even. Then we construct a side deal for this simpler case using the maximum flow–minimum cut theorem, and finally, transform this into a side deal of the original setup.

Let $c'(u, v) = c(u, v) - f(u, v)/(1 - \varepsilon)$ be a capacity on G. Note that any flow g under $c'$ can be transformed into a flow $g' = g + f/(1 - \varepsilon)$ that satisfies the capacity constraints c. Consider the functions $h'_i = h_i - f$. It is easy to verify that the $h'_i/(1 - \varepsilon)$ satisfy the capacity constraints with respect to $c'$ and that $h' = p h'_1 + (1 - p) h'_0$. Let (S, T) be a minimal cut of the directed flow network with capacity $c'$. By the maximum flow–minimum cut
theorem, there exists a maximum flow g in the network that uses the full capacity of this cut. By assumption, the value of the cut under $h'_1$ satisfies $h'_1(S, T)/(1 - \varepsilon) \le g(S, T) < x_1/(1 - \varepsilon)^2$, which implies that $(1 - \varepsilon)[h'_1(S, T) - h'_0(S, T)] < \phi V$ because $(1 - \varepsilon)|h'| \ge pV$. In words, the value flowing through the minimal cut in the two states does not provide sufficient incentives not to steal the asset.

We now construct a side deal for the reduced problem. The idea is to construct a transfer arrangement that satisfies flow conservation inside S and delivers to the “boundary” of S the exact amount that was promised to be carried over to T under $h'$. With such an arrangement, all agents in S will break even in each state, and thus the incentives that applied to S as a group will apply directly to agent s. Because S as a group did not have the right incentives, with the side deal s will not have the right incentives either.

Formally, using the implicit summation notation, for each u ∈ S, let $g(u, T)$, $h'_1(u, T)$, and $h'_0(u, T)$ denote the amounts leaving S through u via the maximum flow g, $h'_1$, and $h'_0$. Clearly, $(1 - \varepsilon)\, g(u, T) \ge h'_1(u, T)$ and $(1 - \varepsilon)\, g(u, T) \ge h'_0(u, T)$. Now consider the restriction of g to the set S. This is a weak flow, and by the lemma it can be decomposed as $g = \sum_{u \in S} g_u$. Define

$$\hat h_1 = \sum_{u \in S} \frac{h'_1(u, T)}{g(u, T)} \cdot g_u \quad \text{and} \quad \hat h_0 = \sum_{u \in S} \frac{h'_0(u, T)}{g(u, T)} \cdot g_u.$$

Then $\hat h_1$ and $\hat h_0$ are both weak flows in S, they satisfy $\hat h_i \le (1 - \varepsilon)\, c'$, and they deliver exactly $h'_1(u)$ and $h'_0(u)$ to all u ∈ S. Thus $\hat h_i$ satisfies flow conservation within S, and delivers to the “boundary” of S the amount promised to be carried over to T under $h'_i$, as desired. The total value delivered by $\hat h_i$ is the value of the cut links under $h'_i$; hence the amount that leaves s in the two states under $\hat h$ satisfies $(1 - \varepsilon)[|\hat h_1| - |\hat h_0|] < x_1 - x_0$, that is, is insufficient to provide incentives not to steal the asset.

Now go back to the original network, and consider a side deal with all agents in the set S, where these agents are promised a transfer arrangement $f + \hat h_i$. This is just adding back the profits of all agents to the side deal of the reduced problem. With this definition, the new side deal satisfies the capacity constraints $f + \hat h_i \le (1 - \varepsilon)\, c$ because $\hat h_i \le (1 - \varepsilon)\, c' = (1 - \varepsilon)\, c - f$. Second, all agents in S will be indifferent, because they get the same expected profits delivered by f (note that $\hat h$ is a flow in both states and thus nets to zero state by state). The agents who have links that are in the cut are indifferent because $\hat h$ is defined so that its inflow equals the required outflow for these agents. Third, the side deal
does not have enough incentives for s not to steal the asset, because $|\hat h_1| - |\hat h_0| < \phi V/(1 - \varepsilon)$. Moreover, if the original deal was beneficial for s, then so is the new deal. This is because the cost of the original deal was $|f| + |h'|$. The cost of the new deal if the borrower follows the honest asset-return policy is $|f| + |\hat h|$. But both $h'$ and $\hat h$ are flows, and they are equal on the (S, T) cut; hence they have equal values. Therefore, by following an honest policy, the borrower will have a cost equal to what he had to pay in the original deal. However, because the incentive compatibility constraint is not satisfied, the borrower is strictly better off always stealing the asset in the side deal. This argument shows that there exists a side deal in which the borrower is strictly better off, and all other players are best-responding; hence the original equilibrium was not side deal proof.

It remains to consider the cases where either $\chi_0$ or $\chi_1$ is equal to zero. In these cases, define the expected transfer payments as $h = p \chi_1 h_1 + (1 - p) \chi_0 h_0$. As above, h is a weak flow and thus f, the weak flow delivering the expected profits to all intermediate agents, can be defined. Similarly, one can define $c'$ and $h'_i$, and letting (S, T) be the minimal cut of $c'$, $h'_1(S, T)/(1 - \varepsilon) < x_1/(1 - \varepsilon)^2$ must hold.

Case II. $\chi_0 = 1$ and $\chi_1 = 0$. Then $h = (1 - p) h_0$ and the decomposition $h = f + h'$ yields $h_0 = f/(1 - p) + h'/(1 - p)$, so that $h'_0 = h_0 - f = f \cdot p/(1 - p) + h'/(1 - p)$ is a weak flow, because it is a sum of two weak flows. It follows that $|h_0| = |f| + |h'_0| \ge |f| + |h'_0(S, T)|$. Therefore $|f| + |\hat h_0| \le |h_0|$, because $\hat h_0$ is a flow and $\hat h_0 = h'_0$ on the (S, T) cut. Moreover, incentive compatibility requires $y_1 - (1 - \varepsilon)|h_0| \ge \phi V$, whereas the break-even constraint of the lender means that $p y_1 + (1 - p)(1 - \varepsilon)[|h_0| - |f|/(1 - p)] \ge pV$. Combining these inequalities gives $y_1 \ge x_1 + (1 - \varepsilon)|f|$. Now consider the side deal $\hat h_i + f$ defined as above. Because $|\hat h_0 + f| \le |h_0| \le y_0/(1 - \varepsilon)$ and $|\hat h_1 + f| < x_1/(1 - \varepsilon) + |f| \le y_1/(1 - \varepsilon)$, the borrower will strictly prefer this arrangement to the previous one. Because all intermediate agents get net profits delivered by f in both states in the side deal, they are indifferent. Thus the proposed arrangement is indeed a side deal.

Case III. $\chi_0 = 0$ and $\chi_1 = 1$. Here $h_1$ is a weak flow, which must deliver less than $x_1/(1 - \varepsilon)$ to t, because by assumption $x_1/(1 - \varepsilon)$ is more than the maximum flow. Thus incentive compatibility fails with the original agreement; even without any side deal, the borrower is better off not returning the asset.
Case IV. $\chi_0 = 0$ and $\chi_1 = 0$. Here a valid side deal is to pay $y_0$ in state zero and propose the transfer arrangement $\hat h_1$ for state 1. All intermediate agents are indifferent because they were getting zero in the original arrangement, and because $\hat h_1 < x_1/(1 - \varepsilon) \le y_1/(1 - \varepsilon)$, the expected payment in the side deal is strictly lower than in the original deal.

In the proof so far, we have only considered the case where the borrower does not steal the asset on the equilibrium path. If the equilibrium is such that the borrower always steals, then $\min[(1 - \varepsilon)|h_1|,\, y_1] \ge V$ must hold. If $\chi_1 = 1$, then $h_1/(1 - \varepsilon)$ is a weak flow with respect to capacity c that must transfer at least $V/(1 - \varepsilon)^2$ to t. This leads to a condition on the maximum s → t flow that is stronger than (11). If $\chi_1 = 0$, then a valid side deal is to propose the transfer arrangement $\hat h_1$ for both states. As above, all intermediate agents are indifferent, and $\hat h_1 < x_1/(1 - \varepsilon) \le y_1/(1 - \varepsilon)$ holds, which proves that the expected payment in the side deal is strictly lower than in the original deal.

APPENDIX III: EMPIRICAL MODEL

The utility function (7) that forms the basis of the discrete-choice model can be micro founded in the following way. Suppose that borrower s needs a loan of value V and needs to decide which of his friends to borrow from. Each potential lender t has an opportunity cost $k(V) + \nu_S$ of providing the loan, where $\nu_S$ is a supply shock unobserved to the borrower, which is independent across lenders. If the borrower chooses lender t, he is expected to repay both the value and the lender’s full opportunity cost.33 Beyond the cost of a loan, the choice of lender is also influenced by the level of trust. We assume that the true level of trust between s and t is $\alpha + T^{st}(c) + \varepsilon_M$, where $\alpha + \varepsilon_M$ reflects both measurement error in network-based trust and other sources of trust. When the expected repayment $k(V) + \nu_S$ exceeds the level of social trust between borrower and lender, the excess amount must be secured using physical collateral. We assume that providing such physical collateral (e.g., a radio or a bicycle) has an opportunity cost that equals γ times the value of collateral. With these assumptions, the realized utility of borrowing from t is

$$\omega(V) - \gamma \cdot \max\{0,\; k(V) + \nu_S - \alpha - T^{st}(c) - \varepsilon_M\} - k(V) - \nu_S,$$

where ω(V) is the utility from borrowing. In this expression $\nu_S$ is unobservable, and hence s must take expectations over it. After taking expectations, we obtain

$$u\big(V,\; T^{st}(c) + \varepsilon_M\big) \tag{15}$$

for some function u that is strictly increasing in the second argument when $\nu_S$ has full support.34 If we also incorporate observed supply shocks $\varepsilon'_S$ into the analysis, then the final utility representation becomes $u\big(V,\; T^{st}(c) + \varepsilon_M - \varepsilon'_S\big) - \varepsilon'_S$. Assuming that u is close to linear in the second argument, which would be the case if $\nu_S$ had sufficient variance, and letting $\varepsilon_S = (1 + 1/u_2)\, \varepsilon'_S$, where $u_2$ is the derivative of u in the second argument, we can approximate this total utility as a linear function of $T^{st}(c) + \varepsilon_M - \varepsilon_S$. In this representation, the error term captures a combination of supply shocks and measurement error in trust.

33. In many societies there is a social convention that agents are only to repay the nominal amount borrowed. However, there is often an understanding that lenders should be further compensated using in-kind transfers and gifts. Here we do not distinguish between these different forms of compensation.

YALE UNIVERSITY
HARVARD UNIVERSITY AND NBER
IOWA STATE UNIVERSITY
UNIVERSITY OF CALIFORNIA–BERKELEY
REFERENCES Ali, Nageeb, and David Miller, “Enforcing Cooperation in Networked Societies,” U.C. San Diego, Working Paper, 2008. Allcott, Hunt, Dean Karlan, Markus Mobius, Tanya Rosenblat, and Adam Szeidl, “Community Size and Network Closure,” American Economic Review Papers and Proceedings, 97 (2007), 80–85. Ambrus, Attila, Markus Mobius, and Adam Szeidl, “Consumption Risk-Sharing in Social Networks,” Harvard University and U.C. Berkeley, Working Paper, 2008. Arrow, Kenneth J., The Limits of Organization (New York: W. W. Norton, 1974). Benoit, Jean-Pierre, and Vijay Krishna, “Finitely Repeated Games,” Econometrica, 53 (1985), 905–922. Berg, Joyce, John Dickhaut, and Kevin McCabe, “Trust, Reciprocity and Social History,” Games and Economic Behavior, 10 (1995), 122–142. Bian, Yanjie, “Bringing Strong Ties Back-In: Indirect Ties, Network Bridges, and Job Searches in China,” American Sociological Review, 62 (1997), 366–385. ——, “Getting a Job through a Web of Guanxi in China,” in Networks in the Global Village: Life in Contemporary Communities, B. Wellman, ed. (Boulder, CO: Westview Press, 1999). 34. Intuitively, if the trust flow is higher, there is a greater chance that the required repayment falls below it, in which case no physical collateral is needed.
Bloch, Francis, Garance Genicot, and Debraj Ray, “Informal Insurance in Social Networks,” Journal of Economic Theory, 143 (2008), 36–58. Bohnet, Iris, Benedikt Herrman, and Richard Zeckhauser, “The Requirements for Trust in Gulf and Western Countries,” Harvard University and University of Nottingham, Working Paper, 2008. Bridges, William P., and Wayne J. Villemez, “Informal Hiring and Income in the Labor Market,” American Sociological Review, 51 (1986), 574–582. Brown, David, The Mobile Professors (Washington, DC: American Council on Education, 1967). Burt, Ronald S., Structural Holes: The Social Structure of Competition (Cambridge, MA: Harvard University Press, 1995). ——, “The Network Structure of Social Capital,” in Research in Organizational Behavior, No. 22, R. I. Sutton and B. M. Staw, eds. (Amsterdam: Elsevier Science, 2000). Calvo-Armengol, Antoni, and Matthew Jackson, “The Effects of Social Networks on Employment and Inequality,” American Economic Review, 94 (2004), 426–454. Coleman, James S., “Social Capital in the Creation of Human Capital,” American Journal of Sociology, 94 (1988), 95–120. ——, Foundations of Social Theory (Cambridge, MA: Harvard University Press, 1990). Cormen, Thomas H., Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, Introduction to Algorithms (Cambridge, MA: MIT Press, 2001). DiMaggio, Paul, and Hugh Louch, “Socially Embedded Consumer Transactions: For What Kind of Purchases Do People Most Often Use Networks?” American Sociological Review, 63 (1998), 619–637. Dixit, Avinash, “Trade Expansion and Contract Enforcement,” Journal of Political Economy, 111 (2003), 1293–1317. Ellison, Glenn, “Cooperation in the Prisoner’s Dilemma with Anonymous Random Matching,” Review of Economic Studies, 61 (1994), 567–588. Fehr, Ernst, and Simon Gachter, “Cooperation and Punishment in Public Goods Experiments,” American Economic Review, 90 (2000), 980–994. 
Fock, Henry K.Y., and Ka-shing Woo, “The China Market: Strategic Implications of Guanxi,” Business Strategy Review, 9 (1998), 33–43. Ford, Jr., Lester Randolph, and Delbert Ray Fulkerson, “Maximal Flow Through a Network,” Canadian Journal of Mathematics, 8 (1956), 399–404. Fukuyama, Francis, Trust (New York: Free Press, 1995). Glaeser, Edward L., David I. Laibson, José A. Scheinkman, and Christine L. Soutter, “Measuring Trust,” Quarterly Journal of Economics, 115 (2000), 811–846. Glaeser, Edward L., Bruce I. Sacerdote, and José A. Scheinkman, “The Social Multiplier,” Journal of the European Economic Association, 1 (2003), 345–353. Corcoran, M., L. Datcher, and G. Duncan, “Information and Influence Networks in Labor Markets,” in Five Thousand American Families: Patterns of Economic Progress, Vol. 8 (Ann Arbor: University of Michigan, Institute for Social Research, 1980). Goyal, Sanjeev, and Fernando Vega-Redondo, “Structural Holes in Social Networks,” Journal of Economic Theory, 137 (2007), 460–492. Granovetter, Mark, “The Strength of Weak Ties,” American Journal of Sociology, 78 (1973), 1360–1380. ——, Getting a Job (Cambridge, MA: Harvard University Press, 1974). ——, “Economic Action and Social Structure: The Problem of Embeddedness,” American Journal of Sociology, 83 (1985), 481–510. Greif, Avner, “Contract Enforceability and Economic Institutions in Early Trade: The Maghribi Traders’ Coalition,” American Economic Review, 83 (1993), 525–548. Guiso, Luigi, Paola Sapienza, and Luigi Zingales, “Cultural Biases in Economic Exchange?” Quarterly Journal of Economics, 124 (2009), 1095–1131. Hardin, Russell, “The Street-Level Epistemology of Trust,” Analyse und Kritik, 14 (1992), 152–176. Ioannides, Yannis M., and Linda Datcher Loury, “Job Information Networks, Neighborhood Effects, and Inequality,” Journal of Economic Literature, 42 (2004), 1056–1093.
Jackson, Matthew, “The Economics of Social Networks,” in Advances in Economics and Econometrics, Theory and Applications: Ninth World Congress of the Econometric Society, Richard Blundell, Whitney Newey, and Torsten Persson, eds. (New York: Cambridge University Press, 2006). Jacobs, Jane, The Death and Life of Great American Cities (New York: Random House, 1993). Johnson, Simon, John McMillan, and Christopher Woodruff, “Courts and Relational Contracts,” Journal of Law, Economics and Organization, 18 (2002), 221–277. Kandori, Michihiro, “Social Norms and Community Enforcement,” Review of Economic Studies, 59 (1992), 63–80. Karlan, Dean, “Using Experimental Economics to Measure Social Capital and Predict Financial Decisions,” American Economic Review, 95 (2005), 1688–1699. Karlan, Dean, Markus Mobius, Tanya Rosenblat, and Adam Szeidl, “Measuring Trust in Peruvian Shantytowns,” Harvard University Discussion Paper, 2008. Knack, Stephen, and Philip Keefer, “Does Social Capital Have an Economic Payoff? A Cross-Country Investigation,” Quarterly Journal of Economics, 112 (1997), 1251–1288. Kranton, Rachel E., “Reciprocal Exchange: A Self-Sustaining System,” American Economic Review, 86 (1996), 830–851. La Porta, Rafael, Florencio Lopez-de-Silanes, Andrei Shleifer, and Robert W. Vishny, “Trust in Large Organizations,” American Economic Review Papers and Proceedings, 87 (1997), 333–338. Lippert, Steffan, and Giancarlo Spagnolo, “Networks of Relations, Word-of-Mouth Communication, and Social Capital,” SSE/EFI Working Paper in Economics and Finance 570, Stockholm School of Economics, 2006. Macaulay, Stewart, “Non-Contractual Relations in Business: A Preliminary Study,” American Sociological Review, 28 (1963), 55–68. Marmaros, David, and Bruce Sacerdote, “How Do Friendships Form?” Quarterly Journal of Economics, 121 (2006), 79–119. Marsden, Peter V., and Jeanne S.
Hurlbert, “Social Resources and Mobility Outcomes: A Replication and Extension,” Social Forces, 66 (1988), 1038–1059. McMillan, John, and Christopher Woodruff, “Interfirm Relationships and Informal Credit in Vietnam,” Quarterly Journal of Economics, 114 (1999), 1285–1320. Mobius, Markus, and Adam Szeidl, “Trust and Social Collateral,” NBER Working Paper 13126, 2007. ——, “The Dynamics of Trust,” Harvard University and U.C. Berkeley Working Paper, 2008. Putnam, Robert D., “Bowling Alone: America’s Declining Social Capital,” Journal of Democracy, 6 (1995), 65–78. ——, Bowling Alone: The Collapse and Revival of American Community (New York: Simon and Schuster, 2000). Saloner, Garth, “The Old Boys’ Network as a Screening Mechanism,” Journal of Labor Economics, 3 (1985), 255–267. Simon, Curtis J., and John T. Warner, “Matchmaker, Matchmaker: The Effect of Old Boy Networks on Job Match Quality, Earnings, and Tenure,” Journal of Labor Economics, 10 (1992), 306–330. Spagnolo, Giancarlo, “Social Relations and Cooperation in Organizations,” Journal of Economic Behavior and Organization, 38 (1999), 1–25. Standifird, Stephen S., and R. Scott Marshall, “The Transaction Cost Advantage of Guanxi-Based Business Practices,” Journal of World Business, 35 (2000), 21–42. Townsend, Robert, “Risk and Insurance in Village India,” Econometrica, 62 (1994), 539–591. Udry, Chris, “Risk and Insurance in a Rural Credit Market: An Empirical Investigation in Northern Nigeria,” Review of Economic Studies, 61 (1994), 495–526. Uzzi, Brian, “Embeddedness in the Making of Financial Capital: How Social Relations and Networks Benefit Firms Seeking Financing,” American Sociological Review, 69 (1999), 319–344. Vega-Redondo, Fernando, “Building Up Social Capital in a Changing World,” Journal of Economic Dynamics and Control, 30 (2006), 2305–2338.
Wasserman, Stanley, and Katherine Faust, Social Network Analysis: Methods and Applications (Cambridge, UK: Cambridge University Press, 1994). Watanabe, Shin, “A Comparative Study of Male Employment Relations in the United States and Japan,” Ph.D. thesis, Department of Sociology, University of California at Los Angeles, 1987. Watts, Duncan J., and Steven H. Strogatz, “Collective Dynamics of ‘Small-World’ Networks,” Nature, 393 (1998), 440–442. Wechsberg, Joseph, The Merchant Bankers (Boston: Little, Brown, 1966).
HOW DOES PARENTAL LEAVE AFFECT FERTILITY AND RETURN TO WORK? EVIDENCE FROM TWO NATURAL EXPERIMENTS∗

RAFAEL LALIVE AND JOSEF ZWEIMÜLLER

This paper analyzes the effects of changes in the duration of paid, job-protected parental leave on mothers’ higher-order fertility and postbirth labor market careers. Identification is based on a major Austrian reform increasing the duration of parental leave from one year to two years for any child born on or after July 1, 1990. We find that mothers who give birth to their first child immediately after the reform have more second children than prereform mothers, and that extended parental leave significantly reduces return to work. Employment and earnings also decrease in the short run, but not in the long run. Fertility and work responses vary across the population in ways suggesting that both cash transfers and job protection are relevant. Increasing parental leave for a future child increases fertility strongly but leaves short-run postbirth careers relatively unaffected. Partially reversing the 1990 extension, a second 1996 reform improves employment and earnings while compressing the time between births.
I. INTRODUCTION Working parents of a newborn child have to give full attention to their baby and their jobs. Aiming to address this double burden for working parents, most OECD countries offer parental-leave (PL) provisions. However, although countries agree that parents of small children need support, the design of current PL systems differs strongly across countries. The purpose of this paper is to provide information on one key aspect of PL. We ask how PL duration affects a working mother who has just given birth to her first child. By studying the decision to give birth to a second child, we can provide information on the role of PL policy for ∗ We are grateful to Larry Katz and to four anonymous referees for their comments. Johann K. Brunner, Regina Riphahn, and Rainer Winkelmann and participants at seminars in Amsterdam, Basel, Frankfurt, Kobe, Rotterdam, Vienna, Zurich, SOLE 2006, ESSLE 2006, the Vienna Conference on Causal Population Studies, the IZA Prize Conference 2006, and the German Economic Association meeting of 2005 also provided valuable discussion on previous versions of this paper. We bear the sole responsibility for all remaining errors. Beatrice Brunner, ¨ Simone Gaillard, Sandra Hanslin, and Simon Buchi provided excellent research assistance. We are grateful for financial support by the Austrian Science Foundation (FWF) under the National Research Network S103, “The Austrian Center for Labor Economics and the Analysis of the Welfare State,” Subproject “Population Economics.” Further financial support by the Austrian FWF (No. P15422-G05), the Swiss National Science Foundation (No. 8210-67640), and the ForschungsStiftung of the University of Zurich (project “Does Parental Leave Affect Fertility Outcomes?” ) is also acknowledged.
© 2009 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology. The Quarterly Journal of Economics, August 2009
higher-order births. This analysis is important for countries with fertility rates below the replacement level. Studying mothers’ return to work, postbirth employment, and postbirth labor earnings, the paper provides information on how PL duration affects subsequent work careers of working women. This analysis allows assessment of the extent to which extending parental leave facilitates balancing work and life. Moreover, studying the effects of parental leave on both fertility and work allows us to assess whether institutions that shape the terms of postbirth female employment spill over to fertility. Our analysis is based on the Austrian PL system. Under Austrian PL rules women can stay off work and return to the same (or a similar) job at the same employer thereafter. During the leave they receive a flat PL benefit of 340 euros per month. Interestingly, Austrian policymakers implemented two major reforms of the duration of PL—an extension of PL duration in 1990 and a reduction of PL duration in 1996. Specifically, before July 1, 1990, the maximum duration of PL ended with the child’s first birthday. The 1990 reform extended PL until the child’s second birthday for all children born on or after July 1. The 1996 reform partially reversed the extension granted in 1990 by taking away the last six months added in 1990. These policy changes create natural experiments that allow us to assess how changes in PL duration affect fertility decisions of a mother who has just given birth to a newborn child. Extending PL duration affects this decision in two different ways. First, the probability of a higher-order birth is potentially determined by the PL duration for the baby that is already born. This is what we call the current-child effect. 
This effect is potentially important in the Austrian context because women who give birth no later than 3.5 months after the end of a previous PL are exempt from the work requirement and can automatically renew PL eligibility for the second child. Before the 1990 reform, mothers needed to give birth to a new child within 15.5 months. Such a tight spacing of children is both biologically difficult and not desired by many parents. The 1990 reform increased this period to 27.5 months, thus providing much broader access to automatic renewal. The 1996 reform reduced the automatic renewal period to 21.5 months—a space between births that is biologically feasible and potentially desired. Second, the probability of a higher-order birth is also determined by PL duration for the baby yet to be born. This is what we call the future-child effect. Because PL duration directly
PARENTAL LEAVE, FERTILITY, AND RETURN TO WORK
affects the costs associated with childbearing, the future-child effect is expected to increase fertility. This paper also studies how PL rules affect postbirth labor market careers of mothers with newborn children. The 1990 extension of leave encourages a mother to stay home with her child in the second year after birth and to delay return to work substantially. Future employment and labor earnings will be affected in two ways. First, providing parents with extended PL encourages mothers to stay off work longer and lowers employment and labor earnings immediately after a birth. This short-run effect is mechanical and intended by policy makers. Second, prolonged periods of absence from the workplace may lead to skill depreciation and weaker labor market prospects after labor market reentry. This potential for long-run deterioration of women’s postbirth careers is clearly not intended by policy. Although family policies in many countries are designed to support low-income (and often nonworking) women, the Austrian case is interesting because it affects working women of all income groups. However, it is not a priori clear for which group the PL rules generate the strongest incentives. The flat PL benefit implies that lower-income parents have a higher earnings replacement ratio. To shed light on the importance of cash transfers, we look at differences in response between high- and low-income women. The job protection policy may be more important for career-oriented women. This is because job protection shields working women from future income losses due to firm-specific human capital depreciation or deferred payment contracts. To shed light on the importance of job protection, we look at differences in responses between blue- and white-collar women. As firm-specific human capital and internal labor markets are arguably more important in white-collar professions, we would expect stronger responses from white-collar women. 
The empirical analysis draws on a unique and very informative data set, the Austrian Social Security Database (ASSD). Set up to provide information to calculate pension benefits for private sector employees (about 80% of Austrian employment), the ASSD collects detailed information on a woman’s earnings and employment history from employers; and it also contains information on take-up of PL benefits and on a woman’s fertility history from the point of time when she first worked in the private sector. We extract information on PL-eligible women who gave birth to the first child observed in ASSD in periods that cover the reform, and we
analyze subsequent fertility and labor market outcomes both in the short run (three years after the first birth) and in the long run (ten years after the first birth). Our empirical analysis uncovers five key results. First, we find that the extension of PL enacted in July 1990 had a strong impact on subsequent fertility behavior. We find that both the current-child PL effect and the future-child PL effect are quantitatively large. In the short run (within three years) fertility increases by 5 percentage points (15%) as a result of extended leave on the current child and by 7 percentage points (21%) as a result of extending leave for the future child. Second, we find not only that fertility increases temporarily, but also that this increase persists in the long run. Among women eligible for the more generous PL rules, three out of 100 women gave birth to an additional child within ten years after the birth of the first child who would not have done so with short leave. Although we do not observe the completed fertility cycles of mothers, we conclude that it is quite likely that the policy change affected not only the timing but also the number of births. Third, we find that most mothers exhaust the full duration of their leaves and that return to work is substantially delayed even after PL has been exhausted (by 10 percentage points in the short run and by 3 percentage points in the long run). Interestingly, although work experience and earnings decrease strongly in the short run, we do not find that longer leaves have long-run effects on work experience and cumulative earnings. Fourth, there are differential fertility responses of high- and low-wage women and blue- and white-collar workers, indicating that both cash transfers and job protection have a sizable impact on fertility and labor market responses. Fifth, we find that the 1996 reduction of PL duration had a significant effect on the timing of subsequent births but no impact on the number of children.
The 1996 partial reversal of the extension granted in 1990 also partially undoes the short-run reductions in employment and earnings generated by the 1990 extension of PL duration. This paper contributes to the literature on the impact of cash transfers on fertility behavior (Hardoy and Schøne 2005; Milligan 2005) and to the literature on the effects of welfare reform on fertility behavior of low-income women in the United States (Hoynes 1997; Moffitt 1998; Rosenzweig 1999; Joyce et al. 2004; Kearney 2004).1 Furthermore, Averett and Whittington (2001) study the
1. Björklund (2007) provides a survey of recent empirical work on the impact of family policies on fertility.
impact of the Family and Medical Leave Act in 1993 on fertility. Hoem (1993) studies the impact of PL rules (“speed premium”) for Sweden, and Piketty (2003) looks at parental education benefits in France. This paper also contributes to the literature studying the effects of family leave on labor market outcomes. Klerman and Leibowitz (1997, 1999) and Baum (2003) find only weak effects on employment and wages. Berger and Waldfogel (2004) for the United States, Baker and Milligan (2008) for Canada, and Ruhm and Teague (1997) and Ruhm (1998) for European countries find a closer relationship between PL provisions and the labor market attachment of mothers. Albrecht et al. (1998) show that PL-induced career interruptions are not associated with a wage penalty for women in Sweden. Schönberg and Ludsteck (2007) study the causal effects of successive changes in PL duration on employment and earnings in Germany.2 Our paper adds to this literature in at least four ways. First, our empirical analysis provides convincing evidence on the effects of changing PL duration for the current child by adopting a quasiexperimental approach. Second, our study also provides evidence on the effect of changing PL duration on the future child. Understanding these two effects is crucial in PL design. Third, our results speak to the important issue of how policies that enhance the balance between work life and family life affect fertility behavior. This is different from many previous papers that have focused on the effect of cash transfers on fertility. Fourth, our empirical analysis allows assessing the effects of changes in PL on both short- and long-run labor market outcomes for mothers. This allows addressing the frequently raised concern that generous PL policies will harm mothers in the long run because extended periods off work lead to depreciation of human capital and worse future labor market prospects.3 The paper is organized as follows.
The next section discusses the institutional setup and develops testable hypotheses as to how the reform might have affected fertility and labor market
2. Several recent papers study the effects of parental leave or child care on child development (Baker and Milligan 2008; Dustmann and Schönberg 2008; Berger, Hill, and Waldfogel 2005; Baker, Gruber, and Milligan 2008). A further related literature analyzes the impact of financial incentives on fertility and labor supply using a structural approach. See Moffitt (1984) for an early approach to this question and Laroque and Salanié (2005) for a more recent study of the effects of financial transfers on fertility and labor supply.
3. In a companion paper, we discuss how the two Austrian reforms affect the quality of mothers’ first postbirth jobs (Lalive and Zweimüller 2007). The analysis of the current paper is more comprehensive in providing a detailed assessment of the overall effects of PL on earnings and employment in all postbirth jobs.
behavior. Section III discusses the data and presents our empirical strategy for measuring the effects of PL duration on the current child and on the future child. Section IV presents the fertility and labor market effects of the increase in PL duration for the current child enacted with the 1990 reform. Section V studies the effect of the 1990 increase in PL duration for the future child. Section VI analyzes the impact of the 1996 reform, and Section VII concludes with a discussion of the relevance of our findings. II. BACKGROUND AND HYPOTHESES This section provides the institutional background of the Austrian PL system and discusses how two reforms in the 1990s may affect higher-order fertility and work careers of mothers. II.A. The Austrian PL System in the 1990s Working women have access to two types of family policies in Austria: maternity leave and parental leave. Maternity leave lasts for sixteen weeks (eight weeks before and eight weeks after the actual birth) and pays the average wage rate over the last quarter before the birth. Before July 1990, PL started after maternity leave ended and lasted until the child’s first birthday. To become eligible for PL a mother had to satisfy a work requirement. Women taking up PL for the first time had to have worked (and paid social security contributions) for at least 52 weeks during the two years prior to birth or be eligible for unemployment benefits—again fulfilling a work requirement of 52 weeks out of the two years prior to entering unemployment. For mothers with at least one previous take-up of PL or first-time mothers below the age of twenty years, the work requirement is reduced to twenty weeks of employment during the last year prior to birth. Moreover, PL is also renewed if the mother gives birth within a “grace period” that extends up to four months after the end of an earlier leave.4 4. 
The exact legal definition of the length of the grace period is that the work requirement is also abandoned if a new maternity protection period starts within a grace period of six weeks after the formal termination of a previous PL. Because maternity protection starts about eight weeks before the due date, the rules effectively imply that eligibility for PL is renewed for any new child expected to be born within fourteen weeks after the end of the previous leave. Because expected birth dates are not observed in our data, we consider a birth to be realized within the automatic renewal window if it occurs no later than four months after the end of the previous PL. Work exemptions for higher-order births are in place in countries where PL lasts long enough so that women could give
[Figure I. Panel (A), Monthly transfer income: maternity leave and PL benefits (euros, 0 to 1,000) against time since birth (months, 0 to 36). Panel (B), Work requirement for higher-order births: work required for PL (weeks, 0 to 30) against time since birth (months, 0 to 36). Each panel plots three regimes: before July 1990, July 1990 until June 1996, and after July 1996.]
FIGURE I PL Benefits and Work Requirement for Higher-Order Births Figure shows the benefit path for a woman earning real 1,000 euros per month before giving birth to her first child (A) and the number of weeks she needs to work for parental leave to cover the subsequent child (B). The parental leave benefit is 340 euros per month irrespective of prebirth monthly income. Dotted line refers to the situation before July 1990, solid line refers to the situation between July 1990 and June 1996, and dashed line refers to the post–July 1996 rules. Source. Austrian federal laws, various years.
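To make the benefit path in Figure IA concrete, the following Python sketch computes stylized monthly transfer income for a mother who stays on leave. It is an illustration under simplifying assumptions that are not taken from the paper: maternity leave is treated as covering exactly the first two months after birth at the prebirth wage, PL then pays the flat 340 euros until the maximum PL duration of the regime, and transfers are zero thereafter. The names `monthly_transfer`, `replacement_ratio`, and `PL_END` are hypothetical.

```python
# Stylized transfer path behind Figure IA (illustrative sketch only).
PL_BENEFIT = 340        # flat monthly PL benefit in euros (from the text)
MATERNITY_MONTHS = 2    # assumption: about two months of postbirth maternity leave

# Maximum PL duration in months since birth under each regime (from the text).
PL_END = {
    "before July 1990": 12,
    "July 1990 until June 1996": 24,
    "after July 1996": 18,
}


def monthly_transfer(month, prebirth_wage, regime):
    """Transfer income in a given month since birth (0-based) while on leave."""
    if month < MATERNITY_MONTHS:
        return prebirth_wage   # maternity leave pays the prebirth wage
    if month < PL_END[regime]:
        return PL_BENEFIT      # flat PL benefit, independent of prior income
    return 0                   # leave exhausted, no further transfer


def replacement_ratio(prebirth_wage):
    """Share of gross prebirth earnings replaced by the flat PL benefit."""
    return PL_BENEFIT / prebirth_wage


if __name__ == "__main__":
    for regime in PL_END:
        total = sum(monthly_transfer(m, 1000, regime) for m in range(36))
        print(f"{regime}: {total} euros over 36 months")
    for wage in (500, 1000, 2000):
        print(f"prebirth wage {wage}: replacement ratio {replacement_ratio(wage):.2f}")
```

Under these assumptions, cumulative transfers over the first 36 months for a mother with prebirth earnings of 1,000 euros are 5,400, 9,480, and 7,440 euros in the three regimes, and the flat benefit replaces a larger share of earnings the lower the prebirth wage.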
PL provisions are twofold. On the one hand, PL protects the previous job. A mother has the right to return to her previous employer until PL ends. Moreover, she cannot be dismissed during the first four weeks after returning to work.5 On the other hand, PL is associated with a government transfer. A mother eligible for PL in 1990 received a PL benefit of about 340 euros per month (31 percent of gross median female earnings). Benefits are not means-tested and not taxed, implying a median net income replacement ratio of more than 40 percent. Single women (or women with a low-income partner) are eligible for higher benefit levels (Sonderunterstützung). Figure IA shows the time path of transfer income for a PL-eligible woman who has earned 1,000 euros per month during the quarter before birth. Maternity leave transfers amount to 1,000
birth to a new child while being covered by a PL from a previous child. In Germany, job protection is extended when a mother gives birth to a child within a current leave. The “speed premium” in Sweden grants higher PL benefits to parents who have subsequent children within sufficiently short intervals. Also, the PL systems of the Czech Republic, Slovakia, and Estonia feature renewal rules that are very similar to the Austrian system. 5. PL renewal makes mothers eligible for a maternity leave transfer that is eighty percent higher than the regular PL transfer. The maternity leave transfer of mothers who work in between two births equals the average wage in the three months prior to giving birth (the same as for a first birth). PL renewal leaves job protection unchanged.
euros for a period of about two months after birth.6 After maternity leave has been exhausted, this mother has two options. One is to arrange care for her newborn child and return to her prebirth job earning 1,000 euros per month. This option is complicated by the fact that the Austrian child care system for children under the age of three years is rather limited. Alternatively, she can provide care for her newborn child and take up PL, earning 340 euros per month until her child turns one year old. On her child’s first birthday, she can decide to return to her prebirth job, take up a new job, or continue to provide care for her child. In the event that she gives birth to a new child before her previous child turns 15.5 months old, she has access to renewed leave (Figure IB). Any child born after that date will be covered only if the mother has been working for at least twenty weeks prior to giving birth to the new child. A first PL reform, enacted on July 1, 1990, increased the maximum duration of PL to two years.7 A further reform in 1996 changed PL duration by introducing a one-partner PL maximum of eighteen months. Because Austrian fathers effectively do not take up PL, the one-partner maximum removed the last six months of PL that were added in 1990.8
6. Because maternity leave is initiated eight weeks prior to the expected date of birth, the pre- and postbirth durations of maternity leave vary.
7. In December 1989, the PL system was changed from a “maternity” to a “parental” leave system, allowing for the father to go on PL also. However, this is of no practical consequence. In 1990 fewer than 1% of fathers took advantage of that possibility. A second change was that women in farm households and family businesses, as well as women who did not meet the employment requirements, became eligible for a transfer equal to 50% of regular PL benefits up until the child’s second birthday.
This is of no importance in the present analysis because we confine ourselves to behavior of female dependent employees. Furthermore, the reform made it possible to take part-time PL, either between a child’s first and second birthday (by both parents at the same time) or between a child’s first and third birthday (only one parent or both parents alternating).
8. Compared to the U.S. Family and Medical Leave Act (FMLA), the Austrian PL rules are very generous. The FMLA grants twelve weeks of unpaid leave to employees in firms with more than fifty workers. Compared to current OECD systems, the pre-1990 PL system was of average generosity. The rules were very similar to those currently in place in Canada (twelve months PL, cash transfer 55 percent of previous earnings), Australia (twelve months PL, unpaid), or the United Kingdom (eighteen weeks paid maternity leave, thirteen months unpaid PL). Austrian post-1990 rules are more similar to those currently in place in continental Europe. The German system grants three years of PL and a generous cash benefit (100 percent of prior earnings on maternity leave, 67 percent of prior earnings for the first fourteen months of PL, and a flat transfer thereafter). In France mothers get three years of PL, 80 percent of prior earnings for the first twelve months, and a flat transfer thereafter. Also, Sweden and Norway offer long leaves and PL benefits replace a very large fraction of prior income. Interestingly, the renewal option is not unique to the Austrian system. The Swedish “speed premium” shares similarities, as PL benefits are extended when an additional
II.B. Effects on Fertility and Work Careers Extending PL from one year to two years may affect future fertility for two different reasons: (i) a longer PL duration for the current child facilitates access to automatically renewed PL benefits for a new child; and (ii) a longer PL duration for the future child directly reduces the cost of this child. Arguably, taking advantage of PL renewal is difficult when PL is short. The one-year policy forces a mother who wants PL protection for the future child to conceive a new child quite early after the birth of the previous child.9 The 1990 reform adds twelve crucial months to the automatic renewal period. Thus, achieving automatic PL coverage for a future child is easier under the two-year policy than under the one-year policy (Figure IB).10 In contrast to the 1990 PL extension, the 1996 PL reduction did not change the biological feasibility of PL renewal. To become eligible without having to go back to work, a mother has to give birth to a new child within 21.5 months. By inducing mothers to give birth to a future child within the automatic renewal period, the 1990 PL extension is likely to change the spacing of births. Note, however, that any shock inducing mothers to give birth to planned children earlier may translate into a long-run increase in the total number of children. As fertility plans are realized earlier, shocks to partnerships, health, etc., that induce parents to give up family plans in a one-year system have weaker effects on fertility in a two-year system.11
child is born within two years after the birth of a previous child. The German system also grants the possibility of PL renewal with respect to job protection (but, unlike in the Austrian system, PL benefits do not cease when a parent goes back to work). The PL systems of the Czech Republic and Slovakia are almost identical to the Austrian system and have the PL renewal feature. 9.
To see this more clearly, consider a woman who gives birth on September 1, 1988. She would be entitled to PL through September 1, 1989. To qualify for PL renewal, with the eight-week prebirth maternity leave and the six-week post-PL grace period, she would have to give birth by December 14, 1989. Note that this requires conceiving a new child by March 1989, no later than 5.5 months after giving birth to the previous child, implying a space between births of at most 15.5 months.
10. Under the two-year policy, a woman who gives birth on September 1, 1990, qualifies for PL renewal if a second child is conceived by March 1992, or 18 months subsequent to giving birth to the previous child.
11. Notice that when this argument is applied to the 1996 PL reduction, it is not clear whether this reform will lead to more or fewer children. On the one hand, the 1996 reform requires a tighter space between births for a mother who takes advantage of renewal. On the other hand, although the required space is biologically feasible, it is shorter than before and may induce some mothers to delay a planned birth. The first effect increases and the last effect decreases the number of births.
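The renewal windows quoted in the text (15.5, 27.5, and 21.5 months) follow from adding the roughly 3.5-month renewal margin to the maximum PL duration. The short Python sketch below reproduces that arithmetic and checks whether a given spacing between births would renew PL automatically. It works in months only and ignores the exact legal definition in weeks; the function names are hypothetical and not part of the paper.

```python
# Automatic renewal window, in months since the previous birth (illustrative sketch).
RENEWAL_MARGIN = 3.5   # births up to 3.5 months after PL ends renew eligibility (from the text)

# Maximum PL duration for the current child, in months (from the text).
PL_DURATION = {"pre-1990": 12, "1990 reform": 24, "1996 reform": 18}


def renewal_window(pl_months):
    """Largest spacing between births (months) that still renews PL automatically."""
    return pl_months + RENEWAL_MARGIN


def qualifies_for_renewal(spacing_months, pl_months):
    """True if a birth `spacing_months` after the previous one falls inside the window."""
    return spacing_months <= renewal_window(pl_months)


if __name__ == "__main__":
    for regime, pl in PL_DURATION.items():
        print(regime, renewal_window(pl), qualifies_for_renewal(20, pl))
```

For example, a spacing of 20 months between births falls outside the window under the one-year policy but inside it under both the 1990 and the 1996 rules, which is the sense in which the 1996 reduction left renewal biologically feasible.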
Extending PL may also affect higher-order fertility because it lowers the cost of having a future child. Ceteris paribus, a two-year leave for the future child is more attractive than a one-year leave. Thus, the 1990 reform is expected to increase the number of children born to women exposed to the new policy. The second key aim of PL policies is to facilitate balancing family work and market work. Changes in the duration of PL for the current child may affect mothers’ work careers in two ways. Take-up of extended leaves delays return to work, lowers employment, and lowers labor earnings in the short run (0–36 months after the birth of the current child). Moreover, prolonged career interruptions may also lead to mothers’ postbirth careers deteriorating in the long run (37 to 120 months after the birth of the current child). Changes in PL duration for the future child are expected to affect short-run postbirth work careers only indirectly, via their effect on births. Austrian PL offers two distinct types of benefits: a flat transfer and job protection. Flat transfers translate into strong differences in replacement rates. We therefore expect strong heterogeneity in the responses to changes of PL in mothers with high earnings prior to birth and mothers with low prebirth earnings. Moreover, PL policies target not only costs associated with foregone current income but also costs associated with loss of lifetime income following a job loss. A longer duration of job protection may be particularly beneficial for mothers with firm-specific human capital or mothers who are on deferred payment contracts. To shed light on this issue, we compare women working in white-collar occupations to women in blue-collar jobs. Arguably, job-specific human capital, internal labor market, and career concerns are more important in white-collar jobs, so the job-protection channel should trigger stronger responses for white-collar women than for blue-collar women. III.
DATA AND EMPIRICAL STRATEGY In this section we first discuss the available data. We then present the empirical strategies and explain the assumptions under which we identify the causal effect of changes in PL duration on fertility and labor market outcomes. III.A. Data Our empirical analysis is based on the Austrian Social Security Database (ASSD). This database collects information relevant
to old-age social security benefits. As these benefits depend on individuals’ earnings and employment histories, the database collects information on work histories for the universe of Austrian private sector employees. Furthermore, the database also contains information on exact dates of births. A disadvantage of the ASSD is partial recording of birth histories. ASSD records all births that occur after a woman’s first job in the private sector. This means that we can precisely determine the relative parity of any birth but we cannot determine any birth’s absolute parity.12 Our ASSD extract covers women giving birth to their first ASSD child in the years 1985, 1987, 1990, 1993, and 1996. We observe second-child births, return to work, employment, and earnings for these women until the year 2000, allowing us to analyze about ten years of a woman’s life for the 1990 reform and less than that for the 1996 reform. We focus on mothers who are likely to be at parity one because this yields a comprehensive picture of how changes in PL on this first child affect future fertility and work careers.13 We establish PL eligibility for these women by considering work careers two years prior to their giving birth to the first child. Note that measuring eligibility is complicated because a woman’s work career in the public sector is unobserved (but counts for PL eligibility) and because a woman’s parity is unobserved. Our eligibility indicator allocates a woman into the PL-eligible group if she demonstrates any employment or has ever been eligible for unemployment benefits in the two years prior to giving birth. Clearly, this definition of eligibility may give rise to misclassifying ineligible women into the eligible group, thus reducing take-up. More importantly, this encompassing definition of eligibility allows identification of a group of ineligible women (who neither worked nor received unemployment benefits in the two
12. Partial recording of previous births implies that we cannot precisely determine the parity of a birth. To make things precise, assume a working woman gives birth to a child at age thirty. If this woman is continuously employed in the private sector, we know this birth is her first birth and all subsequent births are recorded in the ASSD. However, if this woman entered the ASSD, say, at age 25 (e.g., because she was previously employed in the public sector and not covered by the ASSD), she could have given birth to children before entering the ASSD. More generally, if we observe x previous births in the data, we know that any subsequent birth is of parity x or higher.
13. Our focus here is on mothers, even though fathers could in principle take up PL provisions too. There are two reasons that we do not include fathers in our analysis. First, take-up by fathers is extremely low. Second, our database does not provide information on the dates of birth of a father’s children. Hence, fathers’ reactions to PL policies cannot be addressed in the present context.
years before giving birth) who do not go on to collect PL in our data. This finding makes us confident that we cleanly identify the group of PL-ineligible women, a group that is of importance in discussing the validity of the empirical strategy measuring the effect of changing PL duration for the future child. Furthermore, in line with demographic research, we restrict attention to women aged 15 to 45 years when giving birth to their first ASSD children. The ASSD allows constructing a set of four key outcome measures. Information on the date of birth of the second ASSD child allows measuring whether a mother gives birth to at least one additional child. Information on the date of return to work allows discussing return-to-work decisions. Information on the woman’s work and earnings career allows assessing employment and earnings in the two years prior to giving birth and up to ten years after birth. In the analysis below, we measure employment and earnings at a yearly frequency relative to the birthday of the first child. The set of conditioning variables comprises information on employment, unemployment, and earnings since entry into ASSD (either 1972 or time of entry into the labor market) and on a woman’s labor market position exactly one year prior to birth (employed or unemployed, industry and region of employer, daily labor income, white-collar or blue-collar occupation).14
III.B. Empirical Strategy Our empirical strategy uses the 1990 and 1996 PL reforms to identify the effect of PL duration for the current child and the effect of PL duration for the future child. Table IA shows PL durations for the current and the future child for three cohorts of women who gave birth to a first child at three different dates: July 1990, June 1990, and June 1987.15 July 1990 mothers are eligible for 24 months of PL for the current child and PL renewal takes place when a future child is born within 27.5 months after the July 1990 birth. PL duration is 24 months for any child born within three years. June 1990 mothers are eligible for 12 months of PL for the current child and PL renewal is possible when a future child is born within 15.5 months. PL duration is 24 months for any child born within three years. June 1987 mothers are eligible for 12 months of PL for the current child and any
14. The data do not have information on hours, education, or marital status.
15. Table IB displays the analogous cohorts for the 1996 reform.
TABLE I
EMPIRICAL DESIGN

Current child   Parental leave   Automatic     Parental leave future child
born            current child    renewal       (born within 36 months)

(A) 1990 reform
June 1987       12 months        15.5 months   12 months
June 1990       12 months        15.5 months   24 months
July 1990       24 months        27.5 months   24 months

(B) 1996 reform
June 1993       24 months        27.5 months   24 months
June 1996       24 months        27.5 months   18 months
July 1996       18 months        21.5 months   18 months

Source: Austrian Federal Laws, various years.
future child born within 36 months. PL is automatically renewed if the future child is born within 15.5 months. Identifying the Current-Child PL Effect. Can the current-child PL effect be identified from a comparison of June 1990 mothers with July 1990 mothers? These two groups differ in the duration of the PL renewal period (and the associated PL duration for the first child), but they have the same PL duration for a future child. The crucial identification issue is to what extent mothers could have influenced the date of birth of the current child in anticipation of the policy change. There are at least two reasons that lead us to believe that mothers cannot have “timed” births. First, the conception of a child is an event that cannot be perfectly planned by parents. Second, even if parents could deterministically plan a birth, self-selection requires that parents have been informed of the July 1990 policy reform at the date of conception. We performed a content analysis of the major Austrian newspapers to check the information that potential parents had nine months before the June/July 1990 births, that is, in September/October 1989. The public discussion about the PL reform started on November 11, 1989, but the ruling coalition (social democrats and conservatives) continued to discuss it until April 5, 1990, when it had designed a policy reform apt to find parliamentary approval. Because it was not clear until three months prior to the policy change whether a PL reform would take place and how it would be implemented, the June/July 1990 births were not influenced by anticipation of the July 1990 PL reform. However,
although anticipation of the PL reform at the date of conception is unlikely, it is still possible that mothers could have influenced the timing of a birth by postponing induced births or planned caesarean sections.16 We assess the presence of such fine tuning in two ways. First, analyzing the number of children born in June and July 1990, we do not find evidence of a spike in births on July 1, 1990. Second, because birth timing is likely to be strongest right around the reform date, we assess the sensitivity of our results by excluding births occurring one week before and one week after July 1, 1990.17 Because babies’ dates of birth assign extended PL and parents could not anticipate extended leave, we can identify the current-child PL effect by comparing treated mothers giving birth (to the current child) in July 1990 to control mothers giving birth in June 1990.18 Although treatment and control samples are selected over two successive months, we consider their fertility and labor market outcomes over the following 36 months (short-run effects) and the following 120 months (long-run effects). Differences between treated and control mothers cannot be attributed to differences in the environment. In fact, the treated and control mothers are facing different parental leave incentives but practically identical economic conditions following the June/July 1990 birth.
16. Gans and Leigh (2006) show that the introduction of the Australian baby bonus on July 1, 2004, led to a significant increase in the number of births on that same day, suggesting that parents postponed their births to ensure they were eligible for the bonus. Similarly, Dickert-Conlin and Chandra (1999) show that the U.S. tax system creates an incentive to give birth to a child on the 31st of December rather than on the 1st of January. They find that the probability that a child is born in the last week of December, rather than the first week of January, is positively correlated with tax benefits.
17. Mothers could also have changed prebirth work patterns in order to become eligible for extended leave. Empirical evidence on prebirth employment in Figure VC is not consistent with such qualification effects.
18. This is essentially a regression discontinuity framework (RDD). Denote the treatment status of a mother by D, where D = 1 if a mother has access to a two-year leave, and D = 0 if a mother has access to a one-year leave. Eligibility for treatment is a discontinuous function of the current child’s date of birth T. Denote by t the date when the policy changes (July 1, 1990, for the change from the 12-month to the 24-month policy and July 1, 1996, for the change from the 24-month to the 18-month policy). Provided that $\lim_{\varepsilon \to 0} \Pr(D = 1 \mid T = t + \varepsilon) = \lim_{\varepsilon \to 0} \Pr(D = 0 \mid T = t - \varepsilon)$, our design satisfies the “fuzzy” RDD assumption (Hahn, Todd, and Van der Klaauw 2001). Figure II is consistent with this assumption being strongly satisfied.
Identifying the Future-Child PL Effect. To identify the effect of PL duration on the future child, we compare short-run (0–36 months after birth) and medium-run (37–72 months after birth) outcomes of June 1987 to June 1990 mothers. Identification of the future-child PL effect requires stronger assumptions than those needed to identify the current-child PL effect. The central identifying assumption is that there are no substantial cohort or time effects that pollute the comparison between June 1987 and June 1990. We propose three ways of assessing the plausibility of this assumption. First, we analyze PL-ineligible women as a control group and check whether outcomes for June 1990 mothers follow a different trend than outcomes for June 1987 mothers, finding no significant time trend. Clearly, this small control group is less than perfect because it encompasses women with weak prebirth labor market attachment. Second, we study time and cohort trends for PL-eligible mothers both before and after the 1990 reform. Third, we also exploit the way PL rules change with time since first birth. Although differences in PL rules between the treated and the control group exist during the first 36 months, the same rules apply 37–72 months after birth. Thus, treated and control mothers should differ in the period 0–36 months after birth but less so in the period 37–72 months after birth. Finally, adding the effect of extending current leave to the effect of extending future leave allows us to estimate the total effect of PL duration and PL renewal. Arguably, this total effect comes close to the effects generated by moving fully from a one-year to a two-year system, an effect of prime policy interest.
IV. EXTENDING PL DURATION FOR THE CURRENT CHILD
This section discusses the effects of extending PL duration for the current child on fertility and return-to-work behavior. The analysis proceeds in three steps. We first document the fertility effects of changing PL duration for the current child. Next, we turn to labor market responses. Finally, we assess the potential heterogeneity in the responses to the reform by groups that differ in income and in broad occupation.
IV.A. The Impact on Fertility
The ASSD reports information on PL take-up. Focusing on PL-eligible mothers of newborn children, Figure II reports average PL take-up (including zeros) associated with a first child born between June 1 and July 30, 1990. June 1990 mothers are eligible for ten months of parental leave (excluding the first two months of maternity leave). Results indicate that of these ten months, June 1990 mothers take up on average nine months of PL.
FIGURE II
Parental Leave Taken with Current Child
[Plot not reproduced. Vertical axis: months (0 to 24). Horizontal axis: calendar day of birth (1/6/90 to 31/7/90), with series labeled “Before” and “After”.]
June smoothed backward, July smoothed forward (15-day moving average). Source. ASSD, own calculations. Sample restricted to PL-eligible women giving birth to a child in June 1–30 or July 1–30 of 1990.
In contrast, average PL take-up by July 1990 mothers amounts to 19 months, which is considerably more than for any mother giving birth in June 1990. Interestingly, average PL take-up is about 85 percent of the 22 months covered by PL after the 1990 reform (after two months of maternity leave). This suggests that the second year of leave is valued by the majority of eligible women. Figure II suggests comparing treated mothers who give birth in July 1990 to control mothers who give birth in June 1990. How informative is this contrast on the causal effect of extending PL duration for the current child? Table II provides descriptive evidence on PL take-up by years since birth as well as on key prebirth characteristics. Treated and control mothers are identical with respect to PL take-up in the first year after giving birth to their current child. Both cohorts take up about 9.2 months out of the roughly 10 months offered by the PL system. Striking differences in PL take-up appear in the second year after birth. Whereas treated mothers spend about ten months on PL, control mothers spend less than one month on PL (with their second child).
TABLE II
DESCRIPTIVE STATISTICS, MOTHERS GIVING BIRTH IN JUNE AND JULY 1990

| | July 1990 Mean | July 1990 SD | June 1990 Mean | June 1990 SD | Contrast Est | Contrast SE |
|---|---|---|---|---|---|---|
| (A) Treatment | | | | | | |
| Parental leave, yr 1 (mths) | 9.208 | (2.194) | 9.196 | (2.139) | 0.012 | (0.055) |
| Parental leave, yr 2 (mths) | 10.082 | (3.486) | 0.795 | (2.209) | 9.287 | (0.074)∗∗∗ |
| (B) Demographics | | | | | | |
| Age 20–24 | 0.423 | (0.494) | 0.443 | (0.497) | −0.02 | (0.013) |
| Age 25–29 | 0.346 | (0.476) | 0.343 | (0.475) | 0.004 | (0.012) |
| Age 30–34 | 0.109 | (0.311) | 0.087 | (0.282) | 0.022 | (0.008)∗∗∗ |
| Age 35–44 | 0.029 | (0.167) | 0.025 | (0.156) | 0.004 | (0.004) |
| (C) Labor market history | | | | | | |
| Employment (years) | 5.846 | (3.94) | 5.701 | (3.792) | 0.144 | (0.098) |
| Unemployment (years) | 0.265 | (0.543) | 0.257 | (0.484) | 0.008 | (0.013) |
| Earnings not observed? (1 = yes) | 0.028 | (0.165) | 0.026 | (0.16) | 0.002 | (0.004) |
| Daily earnings (euros) | 34.49 | (42.942) | 33.313 | (37.719) | 1.178 | (1.026) |
| (D) One year before birth | | | | | | |
| Employed (1 = yes) | 0.868 | (0.339) | 0.867 | (0.339) | 0.001 | (0.009) |
| White collar (1 = yes) | 0.435 | (0.496) | 0.441 | (0.497) | −0.006 | (0.013) |
| Daily 1989 earnings (euros) | 36.826 | (20.935) | 36.142 | (20.397) | 0.684 | (0.526) |
| Observations | 3,225 | | 2,955 | | | |
Source: ASSD, own calculations. Sample restricted to PL-eligible women giving birth to a child in June 1–30 or July 1–30 of year 1990. Notes: Mean and standard deviation (in parentheses) for women giving birth to their first child in June 1–30 or July 1–30, 1990. Labor market history covers the period from January 1972 to date of birth in June or July 1990. Labor earnings are unobserved for women coming from the public sector, which is not covered by ASSD. Daily earnings refer to real mean earnings measured in year 2000 euros per day worked—real total labor earnings divided by work experience. Daily 1989 earnings are earnings on the prebirth job measured in year 2000 euros—the job held exactly one year before giving birth. Data also contain information on region and industry of the prebirth employer.
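As a reading aid for the Contrast columns of Table II, the sketch below recomputes one contrast from the published means, standard deviations, and sample sizes. It assumes the standard error is the usual unequal-variance two-sample formula; the paper does not state how the column was computed, so this is an illustration rather than the authors' procedure. The function name `contrast` is hypothetical.

```python
import math

def contrast(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Difference in means (treated minus control) and its standard error.

    Assumption: unequal-variance two-sample standard error,
    sqrt(sd_t^2 / n_t + sd_c^2 / n_c).
    """
    est = mean_t - mean_c
    se = math.sqrt(sd_t ** 2 / n_t + sd_c ** 2 / n_c)
    return est, se

# Sample sizes reported in Table II.
N_JULY, N_JUNE = 3225, 2955

# Parental leave in year 2 (months): July 1990 vs. June 1990 mothers.
est, se = contrast(10.082, 3.486, N_JULY, 0.795, 2.209, N_JUNE)
print(f"contrast = {est:.3f}, SE = {se:.3f}")
```

Under this assumption the year-2 row comes out at roughly 9.287 with a standard error of about 0.074, in line with the table.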
In contrast to PL take-up, both cohorts appear to be quite similar with respect to prebirth characteristics. Both groups show similar amounts of previous work and unemployment experience and previous average labor earnings. Also, labor market status one year before the 1990 birth differs only slightly. Although July 1990 mothers are significantly more likely to be in the age bracket 30–34 years than June 1990 mothers, the overall age composition is quite comparable. Almost 80 percent of all births occur in the age group 20–29 and about 13 percent at ages thirty and older.
FIGURE III
How Does Parental Leave Affect Higher-Order Fertility?
[Plot not reproduced. Vertical axis: percent (25 to 40). Horizontal axis: calendar day of birth (1/6/90 to 31/7/90), with series labeled “Before” and “After”.]
Figure reports the percentage of women who gave birth to at least one additional child within three years after giving birth in June or July 1990. June smoothed backward, July smoothed forward (15-day moving average). Source. ASSD, own calculations. Sample restricted to PL-eligible women giving birth to a child in June 1–30 or July 1–30 of 1990.
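The one-sided smoothing described in the figure notes can be sketched as follows. This is a minimal illustration on synthetic daily shares, not the ASSD data; the function name, the truncation of windows at the ends of each month, and the noise parameters are all assumptions. Smoothing June backward and July forward means that no average mixes births from both sides of July 1.

```python
import numpy as np

def one_sided_moving_average(values, window=15, direction="backward"):
    """Average each day with up to window-1 days on one side only.

    backward: the day itself and the days before it (used for June);
    forward:  the day itself and the days after it (used for July).
    Windows are truncated at the ends of the series (an assumption).
    """
    values = np.asarray(values, dtype=float)
    smoothed = np.empty_like(values)
    for i in range(len(values)):
        if direction == "backward":
            lo, hi = max(0, i - window + 1), i + 1
        elif direction == "forward":
            lo, hi = i, min(len(values), i + window)
        else:
            raise ValueError("direction must be 'backward' or 'forward'")
        smoothed[i] = values[lo:hi].mean()
    return smoothed

# Synthetic daily shares (percent), 30 days per month; not the ASSD data.
rng = np.random.default_rng(0)
june_daily = 32.0 + rng.normal(0.0, 3.0, size=30)
july_daily = 37.0 + rng.normal(0.0, 3.0, size=30)

june_smooth = one_sided_moving_average(june_daily, direction="backward")
july_smooth = one_sided_moving_average(july_daily, direction="forward")

# Gap at the cutoff: first smoothed July value minus last smoothed June value.
jump_at_cutoff = july_smooth[0] - june_smooth[-1]
```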
Table II also indicates that there are substantially more births in July 1990 than in June 1990. Is this evidence for birth timing? We investigate this issue by analyzing the number of births in June and July 1990 on a day-to-day basis, finding a steady increase but no discontinuity in the number of births on July 1 (not reported). Thus, comparing June 1990 mothers to July 1990 mothers, we find little evidence of seasonality in the composition of cohorts but strong seasonality in the number of births.19
19. Indeed, births in July exceed births in June in any given year of our sample period (1985, 1987, 1990, 1993, 1996). Nevertheless, we perform sensitivity tests comparing births that occur, respectively, in the first/second half of June 1990 and the first/second half of July 1990 to assess the sensitivity of our results to short-run timing of births.
Figure III presents first evidence on the causal effect of extending PL for the current child on the decision to have an additional child. The vertical axis measures the percentage of women who gave birth to a second child within the 36 months following the 1990 birth. To smooth out the noise in date-of-birth data, Figure III presents fifteen-day backward moving averages for June and fifteen-day forward moving averages for July. Results indicate that 32.2 of 100 women in the control group give birth to an additional child within the 36 months. In contrast, almost 36.7 of 100 women do so in the treated group. Thus, almost 5 of 100 women tend to give birth to an additional child in the treated cohort who do not do so in the control group. The magnitude of this effect seems quite robust and varies only slightly over the particular time window one adopts. Table III discusses the validity of this result, explaining the probability of giving birth to an additional child within three years in the context of a linear probability model. Column (1) estimates a baseline difference in short-run higher-order fertility of about 4.5 additional births per hundred mothers. Column (2) includes information on age and prebirth labor market history to assess the sensitivity of the key result to composition of treated and control group mothers. Results indicate that extended PL increases short-run fertility by 4.9 additional children per hundred women. Moreover, estimates indicate that higher-order fertility is lower for older women and for employed women. Column (3) estimates the causal effect of extended leave on births by reducing the width of the baseline window from thirty to fifteen days. Results suggest that about 5.4 children are born to one hundred women with extended leave that are not present under short leave. Anticipating extension of leave, mothers with a strong desire to have two or more children might have timed the birth of their first child to take place on or after July 1, 1990 (by postponing a planned caesarean section or delayed induction of a birth to July 1 or later). Column (4) excludes births taking place one week before and one week after July 1, 1990.
Excluding these observations leaves results essentially unchanged; point estimates even slightly increase. The last column of Table III runs a placebo regression where we repeat the regression of column (2) using data on mothers giving birth to their first ASSD child in June and July 1987. These two groups faced identical PL rules and hence we should not see any major differences between them. In fact, the estimated treatment effect is insignificant and the point estimate very close to zero. The empirical analysis has documented a short-run response of higher-order fertility. Does this short-run response persist in the long run? Contrasting June and July 1990 mothers, Figure IV shows how extended leave for the first child affected higher-order
TABLE III
THE THREE-YEAR FERTILITY EFFECT OF EXTENDING PL DURATION FOR THE CURRENT CHILD, JULY 1990 (24 MONTHS PL) VS JUNE 1990 (12 MONTHS PL)

| | Base | Controls | Half-window | Anticipation | Placebo |
|---|---|---|---|---|---|
| July | .045 (.012)∗∗∗ | .049 (.012)∗∗∗ | .054 (.017)∗∗∗ | .056 (.014)∗∗∗ | .008 (.011) |
| Age 20–24 | | .045 (.024)∗ | .035 (.034) | .042 (.027) | .012 (.021) |
| Age 25–29 | | .029 (.028) | .014 (.039) | .028 (.031) | .048 (.025)∗ |
| Age 30–34 | | −.059 (.033)∗ | −.073 (.046) | −.058 (.037) | .020 (.033) |
| Age 35–44 | | −.111 (.043)∗∗∗ | −.116 (.062)∗ | −.099 (.049)∗∗ | −.125 (.034)∗∗∗ |
| Employment (years) | | .006 (.006) | .003 (.009) | .006 (.007) | .008 (.007) |
| Employment sq. | | −.057 (.038) | −.048 (.054) | −.052 (.044) | −.104 (.046)∗∗ |
| Unemployment (years) | | .011 (.022) | .028 (.034) | .011 (.025) | −.020 (.028) |
| Unemployment sq. | | −1.793 (.694)∗∗∗ | −1.850 (1.202) | −1.783 (.752)∗∗ | 1.065 (1.233) |
| Earnings unobserved | | −.025 (.045) | −.038 (.060) | −.004 (.051) | .062 (.046) |
| Daily earnings (euros) | | .000 (.000) | −.000 (.000) | .000 (.000) | .001 (.000)∗∗ |
| Daily earnings sq. | | −.000 (.000) | .000 (.000) | −.000 (.000) | −.000 (.000)∗∗∗ |
| Employed | | −.111 (.050)∗∗ | −.036 (.075) | −.121 (.057)∗∗ | −.099 (.045)∗∗ |
| White collar | | −.011 (.017) | −.031 (.024) | −.004 (.020) | −.005 (.016) |
| Daily 1989 earnings | | .002 (.002) | .001 (.003) | .002 (.002) | .002 (.002) |
| Daily 1989 earnings sq. | | −.000 (.002) | .000 (.003) | .000 (.002) | −.001 (.002) |
| Industry | No | Yes | Yes | Yes | Yes |
| Region | No | Yes | Yes | Yes | Yes |
| Mean of dependent variable | 0.345 | 0.345 | 0.346 | 0.345 | 0.258 |
| N | 6,180 | 6,180 | 3,045 | 4,757 | 6,151 |
Source: ASSD, own calculations. Sample covers PL-eligible women giving birth to their first child in June 1–30 or July 1–30 of the respective years. Notes: Linear model of the probability of giving birth to a second child within three years of giving birth to a first child in June/July 1990. Standard error in parentheses. ∗ (∗∗ ,∗∗∗ ) denote significance at the 10% (5%, 1%) level, respectively. Inference based on Huber–White standard errors. “Base”: July (24 months PL) vs June (12 months PL). Controls: adds controls (Table II). Half-window: June 16–July 15. Anticipation: June 1–23 and July 8–30. Placebo: June 1987 (12 months PL) vs July 1987 (12 months PL).
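To make the “Base” specification concrete, the following sketch fits a linear probability model with a single July indicator and computes Huber–White standard errors. It runs on synthetic data generated to mimic the group sizes and birth rates quoted in the text, not on the ASSD records. The HC0 form of the sandwich estimator is an assumption, since the notes do not say which variant the authors used, and the function name `lpm_robust` is hypothetical.

```python
import numpy as np

def lpm_robust(y, X):
    """OLS coefficients with Huber-White (HC0) standard errors.

    An intercept column is added to X. Returns (coefficients, std_errors).
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), X])      # add intercept
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = X.T @ (X * resid[:, None] ** 2)         # sum of e_i^2 * x_i x_i'
    cov = XtX_inv @ meat @ XtX_inv                 # sandwich estimator
    return beta, np.sqrt(np.diag(cov))

# Synthetic sample: group sizes from Table II, birth probabilities chosen
# to resemble the 32.2 vs. 36.7 per 100 quoted in the text (assumption).
rng = np.random.default_rng(0)
n_june, n_july = 2955, 3225
july = np.r_[np.zeros(n_june), np.ones(n_july)]
prob = 0.322 + 0.045 * july
second_birth = (rng.random(july.size) < prob).astype(float)

beta, se = lpm_robust(second_birth, july)
print(f"July effect = {beta[1]:.3f} (SE {se[1]:.3f})")
```

With only the July indicator on the right-hand side, the slope equals the difference in the share of mothers with a second birth between the two groups, which is why the Base column can be read directly as a difference in proportions.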
FIGURE IV
Additional Births (“Hazard” and Cumulative Proportion), July 1990 (24 Months PL) vs. June 1990 (12 Months PL)
[Plots not reproduced. Two panels, each with vertical axis in percent and horizontal axis time since birth (months, 0 to 120); series: June 1990 and July 1990.]
Figure reports the additional child hazard, that is, the women giving birth to an additional child in month t as a proportion of those who have not given birth to an additional child up to month t (A), and the cumulative proportion of women giving birth to at least one additional child up to month t (B). Vertical bars indicate end of automatic renewal (dashed for June 1990 mothers, regular for July 1990 mothers). Source. ASSD, own calculations. Sample restricted to PL-eligible women giving birth to a child in June 1–30 or July 1–30 of 1990.
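The two quantities defined in the Figure IV notes can be computed as in the sketch below. It is a simplified illustration: it assumes every woman is observed for the full horizon (no censoring) and codes women without an additional birth as `np.inf`; the function name and the toy input are hypothetical.

```python
import numpy as np

def hazard_and_cumulative(birth_month, horizon=120):
    """Monthly additional-birth hazard and cumulative proportion.

    birth_month: month (since the first birth) of the additional birth,
                 np.inf if no additional birth occurs within the horizon.
    Assumes no censoring before the horizon.
    Returns arrays indexed by month t = 0..horizon (entry 0 is unused).
    """
    birth_month = np.asarray(birth_month, dtype=float)
    n = birth_month.size
    hazard = np.zeros(horizon + 1)
    cumulative = np.zeros(horizon + 1)
    for t in range(1, horizon + 1):
        at_risk = np.sum(birth_month >= t)        # no additional birth before t
        events = np.sum(birth_month == t)         # additional births in month t
        hazard[t] = events / at_risk if at_risk else 0.0
        cumulative[t] = np.sum(birth_month <= t) / n
    return hazard, cumulative

# Toy example: four women, one of whom has no additional birth.
toy_months = [10, 20, 20, np.inf]
toy_hazard, toy_cumulative = hazard_and_cumulative(toy_months)
```

In the toy example, one of four women gives birth in month 10, so the hazard there is 1/4; two of the three remaining women give birth in month 20, so the hazard there is 2/3 and the cumulative proportion reaches 3/4.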
fertility within the ten years following the 1990 birth. Figure IVA shows the second-child hazard rate, that is, the likelihood that a woman gives birth to a second child t months after the 1990 birth conditional on not giving birth to a second child before month t. The control group has a somewhat higher second-child hazard rate between months 12 and 16, whereas the treated group has a much higher hazard between month 18 and month 28. The difference between the two groups is largest during months 22–25, when the additional birth hazard is almost 3.5% for the treated group but less than 2% for the control group. After month 28 there are no major differences between the two groups. This pattern can be rationalized by the PL rules. Recall that the rules grant renewal of PL to control group mothers giving birth before month 16 and to treated mothers giving birth before month 28. Figure IVA shows that the additional-child hazard diverges most strongly when PL renewal is possible to treated mothers but impossible to control mothers. Figure IVB shows the cumulative proportion of women with a second child by time since the 1990 birth. Results indicate that the treated have a lower second-child probability before month 22 but a higher one thereafter. Interestingly, the difference does not erode in the long run. Even ten years after the 1990 birth, the
TABLE IV
SHORT-RUN AND LONG-RUN FERTILITY EFFECTS OF PL DURATION FOR THE CURRENT CHILD, JULY 1990 (24 MONTHS PL) VS JUNE 1990 (12 MONTHS PL)

| | Base | Controls | Half-window | Anticipation | Placebo |
|---|---|---|---|---|---|
| Additional birth 0–36 months | .045 (.012)∗∗∗ [.345] | .049 (.012)∗∗∗ [.345] | .054 (.017)∗∗∗ [.346] | .056 (.014)∗∗∗ [.345] | .008 (.011) [.258] |
| Additional birth 0–120 months | .03 (.012)∗∗ [.617] | .035 (.012)∗∗∗ [.617] | .03 (.017)∗ [.62] | .048 (.014)∗∗∗ [.617] | −.006 (.012) [.556] |
| Additional birth 0–16 months | −.027 (.006)∗∗∗ [.066] | −.026 (.006)∗∗∗ [.066] | −.029 (.009)∗∗∗ [.067] | −.023 (.007)∗∗∗ [.066] | −.006 (.006) [.069] |
| Additional birth 17–28 months | .082 (.01)∗∗∗ [.193] | .084 (.01)∗∗∗ [.193] | .084 (.014)∗∗∗ [.195] | .09 (.011)∗∗∗ [.194] | .011 (.008) [.123] |
| Additional birth 29–120 months | −.024 (.012)∗ [.36] | −.021 (.012)∗ [.36] | −.023 (.017) [.359] | −.018 (.014) [.359] | −.011 (.012) [.366] |
| Observations | 6,180 | 6,180 | 3,045 | 4,757 | 6,151 |
Source. ASSD, own calculations. Sample: PL eligible women giving birth to their first child in June 1–30 (12 months PL) or July 1–30 (24 months PL) in the year 1990. Notes. This table reports the “July 1990” parameter estimate in a linear probability model comparing postbirth fertility of mothers giving birth to their first child in June/July 1990. Standard error in parentheses; mean of dependent variable in brackets. ∗ (∗∗ ,∗∗∗ ) denote significance at the 10% (5%,1%) level, respectively. Inference based on Huber–White standard errors. Base: July (24 months PL) vs. June (12 months PL). Controls: adds controls (Table II). Half-window: June 16–July 15. Anticipation: June 1–23 and July 8–30. Placebo: June 1–23, 1987 (12 months PL) vs. July 8–30, 1987 (12 months PL).
second-child probability of July 1990 mothers is still three percentage points higher than for June 1990 mothers. This suggests that the increase in fertility created by the PL renewal effect affects not just the timing but also the total number of children.20
20. Note that June 1990 mothers might still catch up to July 1990 mothers after ten years or due to differential third-child fertility. Although our data provide a window that is, arguably, too short to provide a definitive assessment of completed fertility, we believe that this is unlikely to happen. First, June and July 1990 mothers face identical economic and political circumstances on the third child. Second, because only about 65 percent of mothers give birth to two or more children, the third-child treatment effect would have to reach an implausibly large magnitude.
Table IV provides an econometric assessment of both short-run and long-run fertility effects using a linear probability model. The first row repeats the results of Table III on short-run fertility. The second row shows the corresponding result for long-run fertility (birth of a second child within 120 months). Column (1) shows that extending PL for the child born in 1990 leads to three additional children being born to one hundred mothers within ten years. Adding controls (column (2)) increases the treatment effect to 3.5 percentage points; halving the estimation window and excluding births closer than seven days before and after July 1, 1990, does not reverse the result. We conclude that extending PL for the current child increases long-run fertility.21 Rows (3)–(5) in Table IV document the timing of excess fertility. Column (2) in Table IV shows that treated mothers reduce future-child fertility by 2.6 percentage points in the period when both treated mothers and control mothers have access to automatic renewal, that is, between months 0 and 16 (row (3)). Then there is a strong increase in fertility by 8.4 percentage points in the period when only treated mothers have access to automatic renewal, that is, months 17–28 (row (4)). Finally, treated mothers are slightly less likely to give birth to a second child between month 29 and month 120 (row (5)). This is the period when neither group has access to automatic renewal. In sum, our results suggest that the short-run change in access to automatic renewal leads to long-run effects on higher-order fertility. The most likely explanation for the high persistence of fertility effects is shocks (to health, partnership, workplace, etc.) that may otherwise induce parents to revise their long-run plans. We show that more generous PL induces parents to realize a planned birth earlier. This means that some shocks that are inducing parents to change family plans in a one-year system no longer affect family planning under a two-year system. This is why short-term gains in fertility also persist in the long run.22
21. Although the effect estimated here seems large, our estimated short-run impact is similar in magnitude to that found by Milligan (2005) for pronatalist transfer policies in Quebec where, depending on the parity, parents got a cash transfer of up to 8,000 CAD. Fertility of those eligible is estimated to have increased 12% on average and 25% for those eligible for the maximum benefit.
22. Although it is true that some women are induced to have a birth within 28 months who would have waited and then never conceived (for preference or biological reasons), there also appear to be some women who are induced to wait beyond 16 months. To the extent that these women experience a negative shock, the net positive effects on the other set of women will be offset. Because there is a positive net fertility effect, the data suggest that any offsetting of this kind is not complete.
IV.B. Labor Market Outcomes
PL rules address the problems of parents in reconciling work and child care. Hence these rules also affect parents’ labor market outcomes. We now explore whether and to what extent extending leave from one year to two years affects women’s labor market outcomes, by focusing on three different indicators. (i) Return to work measures the probability that a woman has returned to work at least once after giving birth to her first child in June or July 1990 in the short run (0–36 months after birth) and in the long run (0–120 months after birth). (ii) Employment refers to the months worked per calendar year after birth in the short run (0–36 months after birth) and in the long run (37–120 months after birth). (iii) Earnings measures average pay earned per calendar day after birth in the short run (0–36 months after birth) and in the long run (37–120 months after birth). Note that both employment and earnings are set to zero in periods where a woman does not work and are included in the empirical analysis.23
Figure VA compares the proportion of women returning to work within 36 months after a birth in June and July 1990 by day of birth. Whereas about 62 of 100 women return to work three years after giving birth in June 1990, only about 52 of 100 women do so after giving birth in July 1990, with a strong discontinuity from June 30 to July 1. This suggests a very strong causal impact of PL duration on short-run return-to-work behavior. Figure VB compares the return-to-work profile during the 120 months following the 1990 birth. The figure clearly shows that the maximum length of PL has an extremely strong impact on return-to-work behavior. About 10% of mothers return to work within two months after giving birth (i.e., the end of maternity protection), the same for July 1990 and June 1990 mothers. About 80% of the treated and 83% of the control mothers exhaust the full PL duration. Although a substantial fraction of both treated (20%) and control mothers (25%) return to work exactly when PL has run out, the majority of women (60% among the treated, 58% among controls) stay home after PL has lapsed.
Moreover, extended leave for the 1990 child seems to lower the fraction ever returning to work. Whereas 85 of 100 control women return to work at least once within ten years after the 1990 birth, only 82 of 100 treated women do so.
23. Return to work at date t measures the probability that a woman has stopped her baby break between date 0 and date t. In contrast, employment counts the days at work between date 0 and date t (set to zero when a woman does not work). Because a woman could have returned to work at some date s < t but dropped out of the workforce at some later date τ ∈ (s, t), the two indicators differ. Unconditional labor earnings are average earnings per month and set to zero when a woman does not work at all during the respective month. Employment and earnings are available at an annual frequency.
FIGURE V
Return to Work, Employment, and Labor Earnings, June 1990 vs. July 1990
[Plots not reproduced. (A) Return to work: percent by calendar day of birth (1/6/90 to 31/7/90), series labeled “Before” and “After”. (B) Proportion returning: percent by time since birth (months, 0 to 120). (C) Employment: months by months since birth (0 to 120). (D) Earnings: year 2000 euros by months since birth (0 to 120). Panels B to D show series for June 1990 and July 1990.]
Panel A reports the percentage of women who have returned to work at least once within three years after the 1990 birth (June smoothed backward, July smoothed forward, fifteen-day moving average); Panel B reports the cumulative proportion of women who have returned to work at least once since the 1990 birth; Panel C reports average months in employment; and Panel D reports mean labor earnings per calendar day since the 1990 birth. (Panels C and D are drawn on an annual frequency; data points at six, eighteen, etc., months refer to the first, second, etc. year after the 1990 birth.) Employment and earnings are set to zero for women who do not hold a job. Zeros are included in all our analyses. Source. ASSD, own calculations. Sample restricted to PL-eligible women giving birth to a child in June 1–30 or July 1–30 of year 1990.
Figure VC explores the effects of extended leave on employment. Employment patterns of women giving birth to their first child are strikingly asymmetric. Whereas paid work takes up about nine to ten months per prebirth year, time spent in the workplace is below seven months in all postbirth years. Interestingly, adverse PL effects on return to work do not translate into lower employment rates. Whereas there is a clear short-run employment disadvantage of treated mothers compared to control mothers in the second year after the 1990 birth, employment is basically the same from the third year onward. Return-to-work
patterns can be reconciled with employment results as follows. Women eligible for short parental leaves who are planning a further child return to work in the short run temporarily to gain access to renewed parental leave. In contrast, women eligible for long parental leaves can exploit the renewal option and do not have to return. This behavioral pattern explains simultaneously the fact that the fraction ever returning to work is lower in the treated group but long-run employment in months 37–120 remains unchanged. Basically, extending parental leave reduces short-run temporary return to work but does not affect longer-run labor supply decisions. Figure VD displays the evolution of average labor earnings for the two groups. Again, earnings patterns are strikingly asymmetric in periods covering a first-child birth. Whereas women earn about 33 euros per calendar day before birth, mothers of newborn children earn less than 30 euros in all ten postbirth years. In terms of assessing the effects of extended PL on earnings, Figure VD shows that prebirth average earnings are identical between June and July 1990 mothers but diverge strongly immediately after they give birth. Earnings are lower for treated women in the first three years after the 1990 birth. From year four onward, however, treated mothers earn slightly more than control mothers. The positive medium-run earnings effect of extended leave could be driven by various channels: participation in work, length of work, selection into work, and a genuine behavioral effect (more hours, better jobs due to promotions, etc.). Long-run employment results (months 37–120) suggest that the joint effects of participation and length of work are close to zero. There is also a small but insignificant composition effect: women with high prebirth wages return to the job earlier with extended leave than with short parental leave. 
This implies that the somewhat higher long-run earnings of July 1990 mothers are due to a genuine behavioral effect. Although this effect is small, it seems that those mothers who return after extended leaves work somewhat more and/or are employed in relatively better paid jobs than mothers who return after shorter leaves. Table V uses linear regression to assess the causal effects of the 1990 PL extension for the current child on short- and long-run labor market outcomes. Columns (1)–(5) assess the sensitivity of the results in the same way as the corresponding columns in Table IV on short-run fertility. Treated mothers are significantly less likely to have returned to work three years
TABLE V
LABOR MARKET EFFECTS OF PL DURATION FOR THE CURRENT CHILD, JULY 1990 (24 MONTHS PL) VS. JUNE 1990 (12 MONTHS PL)

| | Base | Controls | Half-window | Anticipation | Placebo |
|---|---|---|---|---|---|
| Return to work 0–36 months | −.109 (.013)∗∗∗ [.564] | −.11 (.012)∗∗∗ [.564] | −.109 (.017)∗∗∗ [.567] | −.105 (.014)∗∗∗ [.561] | .009 (.012) [.619] |
| Return to work 0–120 months | −.03 (.009)∗∗∗ [.847] | −.027 (.009)∗∗∗ [.847] | −.046 (.013)∗∗∗ [.85] | −.023 (.01)∗∗ [.845] | .009 (.009) [.83] |
| Employment 0–36 months | −.999 (.073)∗∗∗ [2.185] | −1.02 (.071)∗∗∗ [2.185] | −1.001 (.1)∗∗∗ [2.194] | −1.031 (.081)∗∗∗ [2.183] | −.051 (.083) [2.908] |
| Employment 37–120 months | .07 (.111) [5.136] | .074 (.109) [5.136] | .047 (.155) [5.202] | .09 (.124) [5.107] | −.09 (.111) [4.889] |
| Earnings 0–36 months | −2.739 (.335)∗∗∗ [10.159] | −2.998 (.304)∗∗∗ [10.159] | −3.156 (.429)∗∗∗ [10.223] | −3.044 (.348)∗∗∗ [10.096] | −.821 (.321)∗∗ [12.155] |
| Earnings 37–120 months | .852 (.563) [20.759] | .545 (.522) [20.759] | .195 (.74) [21.014] | .522 (.598) [20.658] | −.862 (.518)∗ [19.39] |
| Observations | 6,180 | 6,180 | 3,045 | 4,757 | 6,150 |
Source: ASSD, own calculations. Sample: PL-eligible women giving birth to their first child in June 1–30 (12 months PL) or July 1–30 (24 months PL) in the year 1990. Notes: This table reports the July 1990 parameter estimate in a linear regression/linear probability model comparing postbirth labor market outcomes of mothers giving birth to their first child in June and July 1990. Standard error in parentheses; mean of dependent variable in brackets. Employment and earnings are set to zero for women who do not hold a job. Zeros are included in all our analyses. ∗ (∗∗ ,∗∗∗ ) denote significance at the 10% (5%,1%) level respectively. Inference based on Huber–White standard errors. Base: July (24 months PL) vs. June (12 months PL). Controls: adds controls (Table II). Half-window: June 16–July 15. Anticipation: June 1–23 and July 8–30. Placebo: June 1987 (12 months PL) vs. July 1987 (12 months PL).
after giving birth to their first in July 1990 (row (1)) and the difference is quantitatively large: An additional 10 of 100 mothers have not returned to work within three years after the 1990 birth. This difference in return to work shrinks over time but a significant three-percentage point difference still remains even after ten years (row (2)). Interestingly, although treated mothers work about one month per year less during the first three years after giving birth (row (3)), there are no long-run employment differences between treated and controls. During months 37–120 after birth, average employment is the same for the two groups (row (4)). A similar finding obtains for earnings per calendar month. Treated mothers earn about three euros less from working on the average day of the three first postbirth years (row (5));
1390
QUARTERLY JOURNAL OF ECONOMICS
there is even a positive albeit statistically insignificant earnings differential between treated and control mothers four to ten years after giving birth (months 37–120, row (6)). Thus, although the 1990 reform slightly reduced the number of women ever returning to work, staying out of work for an extended period does not appear to harm employment and earnings of treated mothers. IV.C. Heterogeneous Responses: Income and Occupation Austrian PL provisions offer job protection and a financial transfer during the time a mother stays off work. Although both types of policies reduce the costs of having children, they target quite different dimensions of these costs. Cash transfers help extend the time a mother can stay home with her baby without running into financial distress. This is more likely to help low-income parents. In contrast, job protection extends the time a mother can spend with her baby without losing her job. This is more likely to help career-oriented women, for whom job loss may be very costly. Table VI explores whether extending PL duration affects high- and low-wage women differently. A mother is considered “Hi Wage” if daily earnings on the job held exactly one year prior to giving birth (prebirth job) exceeds the median of daily prebirth earnings (37.12 euros per day worked); and a mother is considered “Lo Wage” otherwise. The flat rate transfer of 340 euros translates into a low replacement rate for high-wage women and a high replacement rate for low-wage women. 
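The sample split and the estimator named in the table notes can be illustrated with a minimal Python sketch. It is not the authors' code. It corresponds only to a "Base"-type specification without controls, uses synthetic data, and rests on two labeled assumptions: the median is taken over all mothers with non-employed mothers counted as zero earners, and the Huber–White standard error is the HC0 variant (the notes do not say which variant is used). The function names `split_by_median` and `lpm_binary` are hypothetical.

```python
import math
import random
import statistics


def split_by_median(daily_wage):
    """Label each mother 'hi' or 'lo' relative to the sample median wage.

    None means no job one year before birth; such mothers are counted as
    zero earners and therefore land in the 'lo' group (an assumption).
    """
    wages = [0.0 if w is None else w for w in daily_wage]
    median = statistics.median(wages)
    return ["hi" if w > median else "lo" for w in wages]


def lpm_binary(outcome, treated):
    """Regress a 0/1 outcome on a constant and a 0/1 treatment dummy.

    With a single dummy regressor the OLS slope equals the difference in
    group means, and the HC0 (Huber-White) variance of that slope is the
    sum over groups of squared residuals divided by the squared group size.
    Returns (coefficient, robust standard error).
    """
    y1 = [y for y, d in zip(outcome, treated) if d == 1]
    y0 = [y for y, d in zip(outcome, treated) if d == 0]
    m1, m0 = sum(y1) / len(y1), sum(y0) / len(y0)
    var = (sum((y - m1) ** 2 for y in y1) / len(y1) ** 2
           + sum((y - m0) ** 2 for y in y0) / len(y0) ** 2)
    return m1 - m0, math.sqrt(var)


# Synthetic illustration only: none of these numbers come from the ASSD data.
random.seed(0)
n = 4000
wage = [None if random.random() < 0.1 else random.uniform(15, 60) for _ in range(n)]
july = [random.randint(0, 1) for _ in range(n)]  # 1 = first birth in July (treated)
group = split_by_median(wage)
returned = [int(random.random() < (0.63 if g == "hi" else 0.50) - 0.10 * d)
            for g, d in zip(group, july)]

for label in ("hi", "lo"):
    ys = [y for y, g in zip(returned, group) if g == label]
    ds = [d for d, g in zip(july, group) if g == label]
    coef, se = lpm_binary(ys, ds)
    print(f"{label} wage: estimate {coef:+.3f}, robust s.e. {se:.3f}")
```

Running the regression separately within each group mirrors the column-by-column layout of Table VI; adding covariates, as in the "Controls" specification, would require a full matrix implementation of OLS.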
Moreover, taking occupation as a proxy for the extent of job-specific skills, we also investigate whether the responses for women holding a white-collar occupation one year prior to birth (column (4)) were different from the responses of women holding a blue-collar job (column (5)).24 For comparison purposes, column (1) repeats the baseline result (column (2) of Tables IV and V).25
24. Women who did not hold a job one year prior to giving birth to the 1990 child are allocated to the low-wage/blue-collar categories. Results remain qualitatively unchanged if we exclude nonemployed women.
25. Clearly, such rough sample splits along one dimension are likely to be contaminated by imbalance along other dimensions. For instance, 62% of high-wage women hold a white-collar occupation, whereas only 44% of all women hold a white-collar occupation. Nevertheless, splitting the sample along these two dimensions allows discussing the relevance of earnings replacement and value of job protection.

TABLE VI
HETEROGENEOUS EFFECTS OF PL DURATION FOR THE CURRENT CHILD (1990 REFORM)

                          All          Hi wage      Lo wage      Wt col       Bl col
(A) Fertility
Additional birth          .049         .036         .068         .055         .048
0–36 months               (.012)***    (.017)**     (.017)***    (.018)***    (.016)***
                          [.345]       [.351]       [.339]       [.349]       [.342]
Additional birth          .035         .016         .054         .034         .036
0–120 months              (.012)***    (.017)       (.017)***    (.018)*      (.016)**
                          [.617]       [.616]       [.618]       [.611]       [.622]
Additional birth          −.026        −.033        −.018        −.018        −.031
0–16 months               (.006)***    (.009)***    (.009)*      (.009)**     (.009)***
                          [.066]       [.06]        [.071]       [.058]       [.072]
Additional birth          .084         .08          .089         .095         .078
17–28 months              (.01)***     (.014)***    (.014)***    (.015)***    (.013)***
                          [.193]       [.203]       [.183]       [.206]       [.183]
Additional birth          −.021        −.031        −.013        −.042        −.008
29–120 months             (.012)*      (.017)*      (.017)       (.018)**     (.016)
                          [.36]        [.353]       [.366]       [.348]       [.369]
(B) Labor market outcomes
Return to work            −.11         −.103        −.124        −.137        −.094
0–36 months               (.012)***    (.017)***    (.018)***    (.018)***    (.017)***
                          [.564]       [.632]       [.496]       [.647]       [.5]
Return to work            −.027        −.028        −.029        −.039        −.018
0–120 months              (.009)***    (.012)**     (.014)**     (.012)***    (.013)
                          [.847]       [.874]       [.82]        [.889]       [.814]
Employment                −1.023       −1.186       −1.15        −1.054       −.556
0–36 months               (.114)***    (.101)***    (.114)***    (.102)***    (.172)***
                          [2.594]      [1.984]      [2.659]      [1.913]      [1.505]
Employment                .2           −.125        .071         .026         .238
37–120 months             (.173)       (.162)       (.171)       (.163)       (.286)
                          [5.88]       [4.762]      [5.95]       [4.681]      [3.92]
Earnings                  −3.548       −2.7         −3.914       −2.347       −2.356
0–36 months               (.56)***     (.354)***    (.564)***    (.335)***    (.748)***
                          [14.247]     [7.183]      [13.921]     [7.454]      [6.508]
Earnings                  1.142        .311         .45          .895         −.403
37–120 months             (.927)       (.64)        (.923)       (.614)       (1.465)
                          [26.467]     [16.139]     [26.336]     [16.183]     [17.179]
Observations              6,180        3,087        3,093        2,705        3,475

Source: ASSD, own calculations. Sample covers women giving birth to their first child in June 1–30 (12 months PL) or July 1–30 (24 months PL) in the year 1990.
Notes: This table reports the “July 1990” parameter estimate in linear regressions/linear probability models comparing postbirth labor market outcomes of mothers giving birth to their first child in June and July 1990. Standard error in parentheses; mean of dependent variable in brackets. Employment and earnings are set to zero for women who do not hold a job. Zeros are included in all our analyses. * (**, ***) denote significance at the 10% (5%, 1%) level, respectively. Inference based on Huber–White standard errors. All: repeats column (2) of Table IV and column (2) of Table V for comparison. Hi/lo wage: median splits for prebirth daily income; Wt/bl col: splits by white- and blue-collar occupation; wage and occupation measured one year prior to birth; women who are not employed one year prior to birth are allocated to the low-wage and blue-collar categories.

Results indicate that the 1990 PL reform led to a significant increase in short-run fertility for both high- and low-wage women (Table VI, columns (2) and (3)). Excess short-run fertility amounts to almost four children per 100 high-wage mothers and to almost seven children per 100 low-wage mothers (row (1)). This result suggests that taking advantage of automatic renewal is less attractive to high-wage women than to low-wage women. Long-run fertility effects disappear for women with high earnings power (1.6-percentage-point difference) but not for women whose prebirth earnings power lies below the median (6.4-percentage-point difference).
Rows (3)–(5) in Table VI assess the timing of fertility. High-wage women delay fertility more than low-wage women up to month 16 (when control women also have access to automatic renewal, row (3)). From month 17 to 28 (when the treated have access to automatic renewal but controls do not; row (4)), treated high- and low-wage women display excess fertility of eight and nine children per 100 women, respectively. In the period following month 29, high-wage (but not low-wage) controls have significantly higher fertility (row (5)). In sum, excess fertility for low-wage women results from not delaying initial fertility by the treated and from not catching up by the controls after automatic renewal has lapsed. These differences in fertility timing are likely to be explained by differences in work attachment between high-wage women (63% return to work within three years, row (6), number in brackets) and low-wage women (only 50% return to work within three years, row (6), number in brackets). Because returning to work between children induces a wider space between first birth and second birth, offering automatic renewal compresses the space between children more strongly for the group with a larger ex ante space between births.
Interestingly, although there are significant differences in fertility responses between high- and low-wage women, the PL reform affects employment and earnings of high- and low-wage women to a similar extent (Panel B in Table VI).
The decrease in return-to-work probabilities is somewhat smaller for high-wage women than for low-wage women in the short run but almost identical in the long run (rows (6) and (7)). The reduction in employment is identical for both groups in the short run and in the long run (rows (8) and (9)). Short-run earnings reductions are greater for high-wage women than for low-wage women (row (10)). However, because employment responses are nearly identical, differential earnings responses arguably reflect ex ante differences in earnings power rather than differential earnings consequences of extending PL duration. There are no significant long-run earnings consequences of extended career interruptions (row (11)).
Turning to fertility results by occupation reveals that short-run and long-run fertility responses are quite similar for white-collar and blue-collar women (columns (4) and (5), rows (1) and (2)): three additional children within 10 years. Yet even though the long-run result is similar, the time pattern of the responses differs somewhat between white- and blue-collar women (rows (3)–(5)). Both blue- and white-collar women delay second-child fertility in the period giving both treated and control women access to automatic renewal (row (3)); both blue- and white-collar women eligible for extended PL concentrate births of second children in the period with access to automatic renewal (row (4)). Yet white-collar control women catch up to treated women more strongly than blue-collar control women in the postrenewal period (row (5)). This pattern of results is, again, consistent with differential postbirth labor market attachment between white-collar women (65% return to job within three years, row (6)) and blue-collar women (50% return to job within three years, row (6)).
Although occupation does not appear to mediate fertility responses to extended leave strongly, occupation is important for the labor market consequences of extended PL (Panel B in Table VI). About 14 of 100 treated white-collar women do not return to work within three years because of extended PL. In contrast, only 9 of 100 blue-collar women delay return to work in the short run (row (6)). Long-run return to work of blue-collar women is not affected, whereas almost 4 of 100 women in the white-collar group do not return to work within ten years (row (7)). Extended PL reduces employment and earnings more strongly for white-collar women than for blue-collar women in the short run (rows (8) and (10)). However, in the long run PL-induced career interruptions are not harmful (rows (9) and (11)).
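The significance markers that separate these white- and blue-collar results can be recovered approximately from each estimate and its standard error. The sketch below is an illustration, not the authors' procedure: it assumes a two-sided test under a standard normal approximation, and because the published figures are rounded, values close to a threshold could be classified differently. The helper names `two_sided_p` and `stars` are hypothetical.

```python
import math


def two_sided_p(estimate, std_error):
    """Two-sided p-value for estimate/std_error under a standard normal."""
    z = abs(estimate / std_error)
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2.0))


def stars(estimate, std_error):
    """Map a p-value to the table's markers: * 10%, ** 5%, *** 1%."""
    p = two_sided_p(estimate, std_error)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


# Return to work within 0-120 months, Table VI, row (7).
for label, est, se in [("white collar", -0.039, 0.012),
                       ("blue collar", -0.018, 0.013)]:
    print(f"{label}: z = {est / se:.2f}, marker = '{stars(est, se)}'")
```

The white-collar estimate is more than three standard errors from zero, whereas the blue-collar estimate is less than one and a half, which is why only the former is read as a long-run effect on return to work.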
In sum, the results suggest that labor market outcomes of white-collar women are more sensitive to extending PL.
We conclude that the finding of higher fertility responses for low-wage women than for high-wage women suggests that cash transfers (through their impact on replacement ratios) are important determinants of fertility responses. Finding that there are no differences in fertility responses between blue- and white-collar workers suggests that the job protection provisions are of importance for white-collar women. White-collar women have, on average, higher incomes than blue-collar women and, on the basis of their lower average replacement ratios, they should also react less strongly to the PL extension. It seems that lower replacement ratios are compensated by the benefits of job protection. This interpretation is consistent with the idea that internal labor markets and career concerns are more important for white-collar jobs but less relevant for blue-collar workers.26
26. Our results also relate to the existing literature on the trade-off between fertility and labor supply. In contrast to our long-run results, Angrist and Evans (1998) find that U.S. women with two children worked less than women with just one child in 1990. There are at least two reasons for the differences in our results. First, Angrist and Evans (1998) do not condition on time since birth. Their finding of a reduction in labor supply for the average mother could be consistent with a temporary reduction in labor supply in the short run but no reduction of labor supply in the long run, which is the situation we document for Austria. Second, Austria and the United States differ in terms of female labor force participation. In 1994, the earliest year with comparable OECD statistics, 65% of American working-age women participated in the labor market whereas only 59% of Austrian women did. Because these participation differences presumably reflect differences in the speed of postbirth labor market reentry, a second child is likely to crowd out more employment in the United States than in Austria. Thus, the fertility effects of extended parental leave may come at higher long-run employment cost in countries with high postbirth labor market participation.

V. EXTENDING PL DURATION FOR THE FUTURE CHILD
This section assesses the effect of extended leave on the future child (“future-child PL effect”).27 To identify this effect we compare the June 1990 cohort (eligible for a one-year PL for the current child but for a two-year PL for the future child) to the June 1987 cohort (one-year PL for the current child and one-year PL for any second child born within three years after the first birth). The June 1990 to June 1987 contrast may be biased due to cohort effects and time trends. Hence Table VII presents a range of supplementary analyses that shed light on the plausibility of the key identifying assumption of identical ex ante fertility plans and labor market trajectories for the two cohorts. Assessing the sensitivity of our results to time trends, we provide (i) a placebo estimate of the reform among PL-ineligible mothers (column (3)), (ii) estimates of a three-year postreform time trend that compares July 1993 mothers to July 1990 mothers (column (4)), and (iii) estimates of a two-year prereform time trend that compares June 1987 mothers to June 1985 mothers (column (5)).28 Moreover, Table VII distinguishes between months 0–36 after the first birth (where second-child duration differs) and months 37–72 after the first birth (where second-child duration is identical).29
27. Note that, although the change in the PL system could in principle also affect first-child, third-child fertility, and so forth, we confine our analysis to second-child fertility because for this parity the effects should be most pronounced.
28. The prereform time trend cannot be three years because our ASSD extract starts in 1985.
29. Note that the inverse pattern of eligibility holds for the pre- and postreform trend cohorts. Prereform cohorts are eligible for the same second-child PL duration during months 0–36 after the first birth, but PL duration differs during months 37–72 after the first birth. In the prereform time trend analysis, for instance, second children of June 1987 mothers are eligible for 24 months of PL duration in months 37–60 after the first birth, whereas second children of June 1985 mothers are still only eligible for 12 months of PL duration. In months 61–72 after the first birth, second children of both cohorts are eligible for 24 months of PL duration.

TABLE VII
THE EFFECT OF PL DURATION FOR THE FUTURE CHILD (1990 REFORM)

                          Base         Controls     Ineligible   Posttrend    Pretrend
(A) Short-run effects (0–36 months after birth)
Additional birth          .069         .068         −.019        −.006        .001
                          (.012)***    (.012)***    (.051)       (.012)       (.011)
                          [.286]       [.286]       [.236]       [.364]       [.25]
Return to work            −.002        −.003        −.053        .008         .018
                          (.013)       (.012)       (.038)       (.012)       (.012)
                          [.622]       [.622]       [.135]       [.524]       [.614]
Employment                −.183        −.164        −.336        −.103        −.045
                          (.084)**     (.083)**     (.216)       (.058)*      (.086)
                          [2.799]      [2.799]      [.583]       [1.71]       [2.909]
Earnings                  −.181        −.229        .512         −.75         −.589
                          (.361)       (.331)       (1.927)      (.282)***    (.325)*
                          [11.68]      [11.68]      [9.031]      [8.974]      [11.849]
(B) Medium-run effects (37–72 months after birth)
Additional birth          −.019        −.014        −.072        −.011        .033
                          (.011)*      (.011)       (.047)       (.01)        (.011)***
                          [.21]        [.21]        [.182]       [.179]       [.205]
Return to work            .028         .024         .011         .022         .005
                          (.009)***    (.01)**      (.043)       (.011)**     (.009)
                          [.158]       [.158]       [.156]       [.227]       [.144]
Employment                −.172        −.191        −.492        .315         −.007
                          (.123)       (.122)       (.397)       (.117)***    (.125)
                          [4.585]      [4.585]      [1.599]      [4.77]       [4.688]
Earnings                  −.13         −.325        −.252        .441         .053
                          (.547)       (.523)       (2.285)      (.524)       (.498)
                          [17.015]     [17.015]     [13.084]     [18.654]     [16.814]
Observations              5,977        5,977        274          6,406        5,892

Source: ASSD, own calculations. Sample covers eligible and ineligible women giving birth to their first child in June or July of the years listed in the notes.
Notes: This table reports the “After” parameter estimate in linear regressions/linear probability models comparing postbirth labor market outcomes of mothers giving birth to their first child in June or July of various years. Standard error in parentheses; mean of dependent variable in brackets. Employment and earnings are set to zero for women who do not hold a job. Zeros are included in all our analyses. * (**, ***) denote significance at the 10% (5%, 1%) level, respectively. Inference based on Huber–White standard errors. Base: eligible, June 1990 (24 months PL for second child, 12 months PL for first child) and June 1987 (12 months PL for first and second child). Controls: adds controls (Table II) to Base. Ineligible: ineligible with controls, June 1990 vs. June 1987. Posttrend: eligible with controls, June 1993 vs. July 1990. Pretrend: eligible with controls, June 1987 vs. June 1985.

Results indicate that the future-child PL effect is quantitatively important (Table VII). Short-run fertility is 7 percentage points higher for June 1990 mothers than for June 1987 mothers (Panel A of Table VII, row (1)). Adding prebirth characteristics does not change this result. We do not find that PL-ineligible June 1990 women tend to give birth to more future children than PL-ineligible June 1987 women (column (3)). Nor do we find that time trends are important. Higher-order fertility is similar between July 1993 and July 1990 mothers (column (4)) and is also similar between June 1987 and June 1985 mothers (column (5)). Interestingly, extending leave for the future child does not appear to affect labor market outcomes to any great extent (rows (2)–(4) of Panel A in Table VII). Return to work and earnings do not display statistically significant effects (rows (2) and (4)). Although work experience is reduced significantly, estimates indicate that treated mothers work only 0.2 months per year less.30
30. Note that employment estimates could be spurious because there is a significantly negative postreform trend in employment (column (3), Table VII). Moreover, the coefficients on labor market outcomes are very imprecise, so the data are consistent with zero effects but also with large negative effects on employment and earnings.
Panel B in Table VII shows that the short-run fertility effect persists in the medium run. Our results indicate that there is no significant catch-up of control mothers during months 36–72. Although June 1990 mothers give birth to slightly fewer second children than June 1987 mothers, the effect is quite small (1.4 children per 100 women) and not statistically significant. Furthermore, results on PL-ineligible women are insignificant. The last two columns of Panel B of Table VII show time-trend results. More precisely, these results measure time trends plus future-child differences in PL duration in the medium run. For instance, pretrend estimates compare June 1987 mothers who are covered by a two-year leave in the medium run (months 36–72 cover the period from July 1990 to June 1993) and June 1985 mothers who are covered by a one-year leave for the first two years and a two-year leave for the third year (months 36–72 cover the period from July 1988 to June 1991). Consistent with this pattern of PL eligibility, the future-child PL effect is 3 percentage points higher for the June 1987 cohort than for the June 1985 cohort.31 Posttrend estimates capture the fact that second-child leave is reduced from 24 months to 18 months for July 1993 mothers, whereas July 1990 mothers still have access to a two-year leave. Point estimates are quantitatively consistent with the future-child effects estimated above but are not statistically significant. Except for a small increase in return to work, we do not find a large future-child PL effect on medium-run labor market outcomes.
What can we learn from the current-child and future-child effects? Recall that the current-child estimates compare families with different benefits (in terms of transfers and time for care) for the current child but identical benefits for the future child. Abstracting from automatic renewal, extending parental leave should crowd out short-run postbirth labor market participation but leave fertility decisions unaffected. In contrast, future-child estimates compare families with different benefits for the future child but identical benefits for the current child. Extending parental leave should affect fertility decisions but not crowd out short-run postbirth labor market participation. Turning to results, we find that extending parental leave for the current child reduces short-run postbirth work experience by one month, whereas the corresponding future-child effect is about one-fifth of a month. Thus, the pattern of labor market results is in line with the pattern of incentives. In contrast, whereas extending parental leave for the future child boosts short-run fertility by 7 percentage points, doing so for the current child also increases short-run fertility by 5 percentage points. Thus, fertility results suggest that access to automatic renewal is valuable; indeed almost as valuable as extended leave for a future child.

VI. REDUCING PL DURATION
This section discusses the effects of the 1996 reduction of PL.
Results comparing mothers giving birth to their first child in July 1996 (eligible for eighteen months of leave) with mothers giving birth to their first child in June 1996 (eligible for 24 months of leave) indicate, first, that the number of children born within three years is not affected by the partial reversal of the 1990 policy change (Table VIII, row (1)). Second, although the number of children is unaffected, the timing of second-child fertility is significantly altered. There is excess future-child fertility of about 3 percentage points before month 22 (when both treated and control mothers have access to PL renewal) and a decrease of the same order of magnitude during months 23–28 (when the treated lose access to PL renewal; Table VIII, rows (2)–(4)). Reducing PL duration strongly affects the timing of births but not the number of children being born because, arguably, mothers could take advantage of PL renewal before and after the 1996 reform, whereas the 1990 reform represents a switch from a system where PL renewal was not feasible to a system where it became highly attractive.32 Third, reducing PL duration affects return to work, employment, and earnings considerably. In the short run, 5 of 100 women return to work within three years who would not with extended leave (Table VIII, row (5)). Women on the 18-month leave also work on average about 0.7 months more per year and earn 3 euros more per day than women with access to a two-year PL (Table VIII, rows (6) and (7)). Notice that the six-month reduction in PL duration affects return to work, employment, and earnings by about half as much (in absolute value) as the twelve-month extension of PL duration in 1990. Hence results for the 1996 reform confirm that PL duration for the current child has a strong impact on short-run labor market outcomes.
31. The effect is about one-half of the short-run estimate in row (1), column (2), of Table VII. This lower importance of extended leave for a future child can probably be explained by two facts. First, the prereform control group of June 1985 mothers gets access to extended leave in the period 61–72 months after birth. Second, mothers might be less responsive to extended leave three to six years after birth than zero to three years after birth.
32. We also investigate the effects of reducing PL duration for the future child by comparing mothers who give birth to a first child in June 1996 to mothers who give birth to a first child in June 1993. Findings indicate that reduced leave on the future child is associated with a decrease in short-run fertility.

TABLE VIII
THE EFFECTS OF REDUCING PL DURATION FOR THE CURRENT CHILD (1996 REFORM): JUNE 1996 (24 MONTHS PL) VS. JULY 1996 (18 MONTHS PL)

                          Base         Controls     Half-window  Anticipation Placebo
(A) Fertility
Additional birth          −.003        −.001        −.025        .003         −.021
0–36 months               (.012)       (.012)       (.017)       (.014)       (.012)*
                          [.321]       [.321]       [.331]       [.309]       [.349]
Additional birth          .028         .028         .023         .022         −.021
0–22 months               (.009)***    (.009)***    (.013)*      (.01)**      (.009)**
                          [.152]       [.152]       [.157]       [.148]       [.151]
Additional birth          −.03         −.029        −.032        −.028        .001
23–28 months              (.008)***    (.008)***    (.011)***    (.009)***    (.008)
                          [.103]       [.103]       [.109]       [.098]       [.122]
Additional birth          .004         .005         −.013        .013         −.003
29–36 months              (.007)       (.007)       (.01)        (.008)*      (.007)
                          [.077]       [.077]       [.076]       [.076]       [.084]
(B) Labor market outcomes
Return to work            .051         .053         .058         .047         .003
0–36 months               (.012)***    (.012)***    (.017)***    (.014)***    (.012)
                          [.661]       [.661]       [.66]        [.663]       [.536]
Employment                .675         .684         .703         .676         .057
0–36 months               (.069)***    (.067)***    (.095)***    (.076)***    (.056)
                          [2.456]      [2.456]      [2.46]       [2.487]      [1.729]
Earnings                  2.65         2.697        2.295        3.045        −.267
0–36 months               (.367)***    (.321)***    (.444)***    (.371)***    (.264)
                          [12.302]     [12.302]     [12.148]     [12.469]     [9.135]

Source: ASSD, own calculations. Sample: PL-eligible women giving birth to their first child in June 1–30 (24 months PL) or July 1–30 (18 months PL) in the year 1996.
Notes: This table reports the “July 1996” parameter estimate in linear regressions/linear probability models comparing outcomes of mothers giving birth to their first child in June or July 1996. Standard error in parentheses; mean of dependent variable in brackets. Employment and earnings are set to zero for women who do not hold a job. Zeros are included in all our analyses. * (**, ***) denote significance at the 10% (5%, 1%) level, respectively. Inference based on Huber–White standard errors. Base: July (18 months PL) vs. June (24 months PL). Controls: adds controls (Table II). Half-window: June 16–July 15. Anticipation: June 1–23 and July 8–30. Placebo: June 1993 (24 months PL) vs. July 1993 (24 months PL).

VII. CONCLUSIONS
The focus of this paper is the relevance of the duration of job-protected, paid PL for higher-order fertility and labor market outcomes of working women. The empirical analysis is based on a 1990 extension of PL duration from one year to two years and on a 1996 reduction of PL from two years to eighteen months.
We find that extending PL affects fertility via two channels. First, increasing leave for the current child opens up the possibility of renewing PL benefits without going back to work. The resulting tighter spacing of births gives rise to both excess short-run fertility (5 additional children per 100 women within three years) and excess long-run fertility (3 additional children per 100 women within ten years). Moreover, increasing leave for the future child reduces the cost of care for that child, inducing mothers to give birth to about 7 additional second children per 100 women. This means that extending job-protected paid PL with automatic renewal from one year to two years induces mothers to give birth to about 12 additional children per 100 women. Regarding the labor market consequences of extended leave, we find that most mothers exhaust the full duration of their leaves; that return to work is substantially delayed even after PL has been exhausted; and that prolonging leave leads to significant short-run reductions in employment and earnings but only minor effects in the long run. Fertility and labor market responses are heterogeneous with respect to earnings and occupation on the prebirth job. This is consistent with both replacement rates and job protection mediating the effects of extended leave on fertility and labor market responses. Finally, our findings indicate that the 1996 reduction of PL duration compresses the space between first and second births, but does not have a significant effect on the number of second children born within three years. Moreover, the labor market responses closely mirror those of the 1990 extension of PL duration.
Providing causal evidence on how Austrian policy changes affect fertility and labor market careers is interesting and important for the non-Austrian public. In many countries fertility levels have fallen strongly and the question of whether PL policies can help to increase fertility is hotly debated. Our results show that such policies can have a quite strong impact and that both transfers and job protection matter for fertility responses. Our analysis of labor market effects addresses the issue of whether too generous PL rules might have a negative impact on mothers’ subsequent work careers, an issue of paramount importance. We think the Austrian case is interesting in this respect because the 1990 PL reform was a move from a system of average generosity (by current OECD standards) to a system of high generosity. Although we do find that the PL extension increases the proportion of women who never return to work, we do not find detrimental effects on employment and earnings over an extended time horizon. Hence we conclude that generous PL policies do not necessarily harm women’s long-run labor market outcomes.
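The combined fertility response of about 12 additional children per 100 women quoted in the conclusions can be traced back to the table estimates. The worked sum below is a sketch that assumes the current-child and future-child channels are additive and both refer to months 0–36 after the first birth; the choice of the "All" column of Table VI and the "Base" column of Table VII is ours for illustration.

```latex
\[
\underbrace{0.049}_{\text{current child (Table VI, column (1), row (1))}}
+ \underbrace{0.069}_{\text{future child (Table VII, Base, row (1))}}
= 0.118 \approx 0.12 ,
\]
```

that is, roughly 5 + 7 = 12 additional children per 100 women after rounding each channel to whole children.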
FACULTY OF BUSINESS AND ECONOMICS, UNIVERSITY OF LAUSANNE, CEPR, IFAU, IZA, CESIFO, AND IEW
INSTITUTE FOR EMPIRICAL ECONOMIC RESEARCH, UNIVERSITY OF ZURICH, CEPR, IZA, AND CESIFO