FINANCE NOTES Mike Cliff Current Draft: June 30, 1998
Contents

1 Introduction

2 Asset Pricing
  2.1 Introduction
  2.2 Portfolio Theory
    2.2.1 Single Period Optimization Problem
    2.2.2 Key Results
    2.2.3 Multiperiod Portfolio Choice
  2.3 Equilibrium Asset Pricing Theory
    2.3.1 Utility Functions
    2.3.2 CAPM Theory
    2.3.3 ICAPM Theory
    2.3.4 CCAPM Theory
    2.3.5 The CIR Model
  2.4 Arbitrage Asset Pricing
    2.4.1 State Contingent Claims
    2.4.2 Arbitrage Pricing Theory
  2.5 Pricing Kernel Approach
    2.5.1 Basics
    2.5.2 Different Expectations
    2.5.3 Asset Pricing with m
    2.5.4 The Agent's Problem
    2.5.5 The Main Results
    2.5.6 Hansen-Jagannathan Bounds
  2.6 Conditioning Information
  2.7 Market Efficiency
  2.8 Empirical Asset Pricing
    2.8.1 Properties of Asset Returns
    2.8.2 General Procedures
    2.8.3 CAPM Tests
    2.8.4 ICAPM/CCAPM Tests
    2.8.5 APT Tests
    2.8.6 Present Value Relations

3 Fixed Income
  3.1 Introduction
  3.2 Term Structure Basics
  3.3 Inflation and Returns
  3.4 Forward Rates
  3.5 Bond Pricing
  3.6 Affine Models
    3.6.1 Vasicek
    3.6.2 The CIR Model
    3.6.3 Duffie-Kan Class
    3.6.4 Other Single Factor Models
    3.6.5 Alternatives
  3.7 Multi-Factor Models
  3.8 Empirical Tests
    3.8.1 Brown & Dybvig (1986)
    3.8.2 Brown & Schaefer (1994)
    3.8.3 Chan, Karolyi, Longstaff & Sanders (1992)
    3.8.4 Gibbons & Ramaswamy (1993)
    3.8.5 Pearson & Sun (1994)
    3.8.6 Longstaff & Schwartz (1992)

4 Derivatives
  4.1 Introduction
  4.2 Binomial Models
    4.2.1 Alternative Derivations
    4.2.2 Trinomial Models
  4.3 Black Scholes Model
    4.3.1 Black Scholes Derivations
    4.3.2 Implied Volatilities
    4.3.3 Hedging
  4.4 Advanced Topics
    4.4.1 American Options
    4.4.2 Exotic Options
    4.4.3 Other Advanced Topics
  4.5 Interest Rate Derivatives
    4.5.1 Stochastic Interest Rate Models
    4.5.2 Stochastic Term Structure Models

5 Corporate Finance
  5.1 Introduction
  5.2 Information Asymmetry/Signaling
  5.3 Agency Theory
  5.4 Capital Structure
  5.5 Dividends
    5.5.1 Factors Influencing Dividend Policy
    5.5.2 Key Dividends Papers
  5.6 Corporate Control
  5.7 Mergers and Acquisitions
    5.7.1 Tender Offers
    5.7.2 Competition Among Bidders
    5.7.3 Managerial Power
    5.7.4 Key Papers
  5.8 Financial Distress
    5.8.1 Factors Affecting Reorganizations
    5.8.2 Private Resolution
    5.8.3 Formal Resolution
    5.8.4 Key Papers
  5.9 Equity Issuance
    5.9.1 Flotation Methods
    5.9.2 Direct Flotation Costs
    5.9.3 Indirect Flotation Costs
    5.9.4 Valuation Effects
    5.9.5 SEO Timing
    5.9.6 Key Papers
  5.10 Initial Public Offerings
    5.10.1 IPO Anomalies
    5.10.2 Key Papers
  5.11 Executive Compensation
  5.12 Risk Management
  5.13 Internal/External Markets and Banking
  5.14 Convertible Debt
  5.15 Imperfections and Demand
  5.16 Financial Innovation

6 Market Microstructure
  6.1 Introduction
  6.2 The Value of Information
  6.3 Single Period REE
  6.4 Batch Models
    6.4.1 Strategic Uninformed Traders
  6.5 Sequential Trade Models
    6.5.1 Specialists and Dealers
    6.5.2 Other Topics
  6.6 Special Topics
    6.6.1 Bubbles
    6.6.2 Speculation
    6.6.3 Noise
    6.6.4 Cascades

7 International Finance
  7.1 Introduction
  7.2 Spot Currency Pricing
  7.3 Forward Currency Pricing
  7.4 Integration
  7.5 International Asset Pricing
  7.6 Other Topics

8 Appendix: Math Results
  8.1 Basics
    8.1.1 Norms
    8.1.2 Moments
    8.1.3 Distributions
    8.1.4 Convergence
    8.1.5 Some Famous Inequalities
    8.1.6 Stein's Lemma
    8.1.7 Bayes Law
    8.1.8 Law of Iterated Expectations
    8.1.9 Stochastic Dominance
  8.2 Econometrics
    8.2.1 Projection Theorem
    8.2.2 Cramer-Rao Bound and the Var-Cov Matrix
    8.2.3 Testing: Wald, LM, LR
  8.3 Continuous-Time Math
    8.3.1 Stochastic Processes
    8.3.2 Martingales
    8.3.3 Itô's Lemma
    8.3.4 Cameron-Martin-Girsanov Theorem
    8.3.5 Special Processes
    8.3.6 Special Lemma

References
Chapter 1 Introduction

These notes are an effort to integrate the body of knowledge encountered in the Finance PhD classes at the University of North Carolina. The original draft of this document was developed to prepare for the area comprehensive exams. As such, the presentation is in a condensed format. The theoretical models are derived in skeleton form, with the focus on the setup and key steps rather than line-by-line explanations. Similarly, empirical work is summarized in terms of purpose, methodology (where important), findings, and fit with the literature. Throughout the manuscript special attention is given to tying together the ideas in different models and areas. Providing this structure should make the material easier to remember and more meaningful to interpret.

The organization of the paper is as follows. Chapter 2 covers asset pricing, both theoretical and empirical. A separate chapter on fixed income securities follows. Chapter 4 covers derivative securities, from both a binomial perspective and a continuous time framework. Chapter 5 covers the main topics in corporate finance, again both theoretically and empirically. Next is a chapter on market microstructure and information economics, which is largely theoretical. A chapter on international finance concludes the main body of the document. The final chapter covers important mathematical, statistical, and econometric issues.

An effort is made to preserve notational consistency, but inevitably there will be deviations. Bold is used for vectors x and matrices X. Time subscripts are dropped unless needed for clarity. Generally t is the current time, T is the end, and τ is the time between two dates. Random variables are given a tilde only as needed. Expectations are with respect to the true probabilities P unless otherwise denoted. The risk-neutral measure is represented by Q. R denotes a gross return, whereas r is a net return. When working with the pricing kernel models it is useful to have a notation for element-wise operations. I use to denote element-by-element multiplication and to represent such a division.

This document draws heavily from a variety of sources, including Bhattacharya and Constantinides (1989), Cochrane (1998), Campbell, Lo, and MacKinlay (1997), Huang and Litzenberger (1988), Ingersoll (1987), Jarrow, Maksimovic, and Ziemba (1995), as well as lecture notes from Dong-Hyun Ahn, Jennifer Conrad, and Dick Rendleman.
June 30, 1998 Mike Cliff
Chapter 2 Asset Pricing

2.1 Introduction
There are three primary approaches to pricing assets. The equilibrium approach begins with agents' preferences (e.g., over expected returns or consumption). Agents maximize expected utility subject to budget constraints and market clearing conditions. Equilibrium models price all assets simultaneously and in equilibrium there is no arbitrage.

The arbitrage approach takes a different point of view. It takes as given the prices of basis assets, which can be combined to generate other payoffs. The absence of arbitrage implies unique prices for these synthetic assets when markets are (locally) complete. If markets are incomplete, there may be a range of admissible prices. Unfortunately, it is generally not possible to recover a supporting equilibrium from the arbitrage approach. Somewhat paradoxically, the arbitrage approach may in fact admit arbitrage opportunities in the sense that selecting different basis assets may give different prices.

The final approach focuses on the pricing kernel. This approach shares many of the features of the first two approaches and provides a unifying framework. Under this paradigm, all assets can be priced by the relation p = E[mx]. Asset pricing models differ in the specification of the pricing kernel m.

One question that arises immediately in asset pricing is the decision to work in discrete or continuous time. The discrete time models were developed first, and they have the benefit of a more intuitive feel. Continuous time models have a number of advantages. With a single state variable, returns are perfectly instantaneously correlated, which simplifies the analysis. More generally, moments higher than the second vanish in continuous time.

Conditional asset pricing models have become popular in response to the failure of unconditional models. A conditional model can capture time-varying expected returns and/or risk premiums.

This chapter develops each theoretical approach, discussing the underlying assumptions and the resulting implications. Derivations are provided for each, and an effort is made to show the connections among the models. The chapter concludes with a summary of the major empirical results and methodologies. We begin the chapter by reviewing portfolio theory.
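The pricing relation p = E[mx] can be made concrete with a small discrete-state example. The sketch below is only an illustration: the three states, their probabilities, the kernel values, and the payoffs are all made-up numbers, not taken from these notes. It prices a riskless and a risky payoff and checks the standard consequence of 1 = E[mR], namely E[R] − R_f = −R_f cov(m, R).

```python
# Hypothetical three-state economy; every number below is illustrative.
probs = [0.3, 0.4, 0.3]   # true (P-measure) state probabilities
m = [1.10, 0.95, 0.80]    # assumed pricing kernel, high in "bad" states


def expect(values):
    """Expectation under the true probabilities."""
    return sum(p * v for p, v in zip(probs, values))


def price(payoff):
    """p = E[m x] for a state-contingent payoff x."""
    return expect([mi * xi for mi, xi in zip(m, payoff)])


riskless_payoff = [1.0, 1.0, 1.0]
risky_payoff = [0.8, 1.0, 1.3]   # pays most in states where m is low

p_f = price(riskless_payoff)     # equals E[m]
p_x = price(risky_payoff)
R_f = 1.0 / p_f                  # gross riskless return
R_x = [x / p_x for x in risky_payoff]  # gross risky return, state by state

# Covariance between the kernel and the risky gross return.
cov_mR = expect([mi * Ri for mi, Ri in zip(m, R_x)]) - expect(m) * expect(R_x)

print(f"riskless price {p_f:.4f}, gross riskless return {R_f:.4f}")
print(f"risky price    {p_x:.4f}, expected gross return {expect(R_x):.4f}")
print(f"risk premium   {expect(R_x) - R_f:.4f}")
print(f"-R_f*cov(m,R)  {-R_f * cov_mR:.4f}")
```

Because the risky payoff is largest where m is smallest, its covariance with the kernel is negative and it earns a positive premium over the riskless rate.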
2.2 Portfolio Theory
Portfolio theory is concerned with the investor's decision to consume or save and with the portfolio selection decision. The theory develops many of the results that appear in the CAPM framework. These results follow from mean-variance mathematics, not from any economic model. Early works were due to Markowitz (1959), who moved the thinking from maximizing E[R] to consideration of both mean and variance. The principle of diversification comes from this work. Under certain conditions, we can consider only the mean and variance of asset returns. One sufficient condition is quadratic utility (see Section 2.3.1). The other sufficient condition is multivariate normality of asset returns. Although neither of these assumptions is likely to hold, the resulting analysis provides an intuitively appealing framework.
2.2.1 Single Period Optimization Problem
In terms of notation, consider a vector of asset weights w, returns r, expected returns µ, and a variance-covariance matrix Σ. Investors minimize variance, subject to achieving a particular return and the portfolio weights summing to one.

$$L = \tfrac{1}{2} w'\Sigma w + \lambda(\mu_p - w'\mu) + \gamma(1 - w'\iota)$$

with FOCs

$$\Sigma w = \lambda\mu + \gamma\iota, \qquad \mu_p = w'\mu, \qquad 1 = w'\iota. \tag{2.1}$$

Solve for w to get

$$w = \lambda\Sigma^{-1}\mu + \gamma\Sigma^{-1}\iota. \tag{2.2}$$

Frontier portfolios are linear combinations of two portfolios. Premultiply (2.2) by ι' and µ', then define $A = \mu'\Sigma^{-1}\iota$, $B = \mu'\Sigma^{-1}\mu$, $C = \iota'\Sigma^{-1}\iota$, $D = BC - A^2$. Combine the expression for w and the FOCs to get $\lambda = (C\mu_p - A)/D$ and $\gamma = (B - A\mu_p)/D$. This gives

$$w_p = g + h\mu_p \tag{2.3}$$

where $g = [B\Sigma^{-1}\iota - A\Sigma^{-1}\mu]/D$ and $h = [C\Sigma^{-1}\mu - A\Sigma^{-1}\iota]/D$. Note that $\iota'g = 1$, $\iota'h = 0$, $\mu'g = 0$, $\mu'h = 1$.
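As a numerical check on the frontier algebra, the following sketch computes A, B, C, D and the vectors g and h for three assets and confirms the four identities noted above. The expected returns and covariance matrix are assumed purely for illustration, and the helper `solve` is a small hand-written linear solver rather than anything from the notes.

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(map(float, M[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * c for a, c in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Illustrative inputs (assumed, not from the notes).
mu = [0.08, 0.12, 0.15]
Sigma = [[0.040, 0.006, 0.010],
         [0.006, 0.090, 0.020],
         [0.010, 0.020, 0.160]]
iota = [1.0, 1.0, 1.0]

Si_mu = solve(Sigma, mu)      # Sigma^{-1} mu
Si_iota = solve(Sigma, iota)  # Sigma^{-1} iota

A = dot(mu, Si_iota)
B = dot(mu, Si_mu)
C = dot(iota, Si_iota)
D = B * C - A ** 2

g = [(B * si - A * sm) / D for si, sm in zip(Si_iota, Si_mu)]
h = [(C * sm - A * si) / D for si, sm in zip(Si_iota, Si_mu)]


def frontier_weights(mu_p):
    """Frontier portfolio with mean mu_p: w_p = g + h * mu_p, as in (2.3)."""
    return [gi + hi * mu_p for gi, hi in zip(g, h)]


w = frontier_weights(0.11)
print("weights:", [round(x, 4) for x in w])
print("sum of weights:", round(sum(w), 10))
print("portfolio mean:", round(dot(w, mu), 10))
```

Any target mean can be passed to `frontier_weights`; the weights always sum to one and hit the target because ι'g = 1, ι'h = 0, µ'g = 0, and µ'h = 1.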
2.2.2 Key Results

From here, we can establish a number of results [see Markowitz (1959) and Roll (1977)].
• The efficient frontier is a hyperbola in µ-σ space.
• The global minimum variance portfolio o is the point $(\sqrt{1/C},\, A/C)$ in (σ, µ) space.
• o is positively correlated with all other minimum variance portfolios and its covariance with these portfolios is its variance, 1/C.
• A frontier portfolio p is efficient if $\mu_p \ge A/C$.
• For all frontier portfolios except the minimum variance portfolio, there exists a unique orthogonal frontier portfolio z with $w_z'\Sigma w_p = 0$.
• All portfolios on the efficient frontier are positively correlated. More generally, $\rho_{p,j} = SR_j/SR_p$ where p is on the efficient frontier. (??)
• $\mu_g = 0$, $\mu_{g+h} = 1$.
• The portfolios g and g + h span the entire frontier.
• Any n ≥ 2 frontier portfolios can span the entire frontier.
• If $w_i$ is efficient then $w_q = A'w_i$ is efficient (A diagonal, trace(A) = 1, and $A_{ii} \ge 0\ \forall\, i$).
• The covariance between the returns of a frontier portfolio p and any other portfolio n (not necessarily on the frontier) is $\lambda\mu_n + \gamma$.
• $\mu_n = \mu_z + \beta(\mu_p - \mu_z)$, where $\sigma_{p,z} = 0$.
• Geometry of tangency lines.
A beta representation is easy to derive from the FOCs.

$$w_p'\Sigma w_p = \sigma_p^2 = \lambda\mu_p + \gamma$$
$$w_i'\Sigma w_p = \sigma_{ip} = \lambda\mu_i + \gamma$$
$$w_z'\Sigma w_p = \sigma_{zp} = \lambda\mu_z + \gamma = 0$$

where z is the portfolio orthogonal to p (or $r_f$ in the SL model) and i is an arbitrary portfolio. Since the third equation equals zero, subtract it from the first two equations.

$$\sigma_p^2 = (\lambda\mu_p + \gamma) - (\lambda\mu_z + \gamma) = \lambda(\mu_p - \mu_z)$$
$$\sigma_{ip} = (\lambda\mu_i + \gamma) - (\lambda\mu_z + \gamma) = \lambda(\mu_i - \mu_z)$$

Solving for λ and rearranging gives the desired result

$$\mu_i = \mu_z + \beta_{ip}(\mu_p - \mu_z). \tag{2.4}$$
Note that the beta is measured relative to portfolio p, which is so far unspecified. The important point of Roll's critique is that this representation is a mathematical result of the setup of the minimization problem. It does not have any economic content unless we specify p as a particular portfolio. His critique also says that the CAPM is not testable because the market portfolio includes all assets, which we cannot measure.
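The purely mathematical nature of the beta representation can be seen numerically: pick any frontier portfolio p other than the minimum variance portfolio, and (2.4) holds exactly for every asset. The sketch below uses assumed inputs and a hand-written solver, and it obtains $\mu_z$ from the condition $\lambda\mu_z + \gamma = 0$ used in the derivation.

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(map(float, M[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * c for a, c in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Illustrative inputs (assumed, not from the notes).
mu = [0.08, 0.12, 0.15]
Sigma = [[0.040, 0.006, 0.010],
         [0.006, 0.090, 0.020],
         [0.010, 0.020, 0.160]]
iota = [1.0, 1.0, 1.0]

Si_mu = solve(Sigma, mu)
Si_iota = solve(Sigma, iota)
A, B, C = dot(mu, Si_iota), dot(mu, Si_mu), dot(iota, Si_iota)
D = B * C - A ** 2

mu_p = 0.13                        # arbitrary target mean, not equal to A/C
lam = (C * mu_p - A) / D           # multiplier on the return constraint
gam = (B - A * mu_p) / D           # multiplier on the budget constraint
w_p = [lam * sm + gam * si for sm, si in zip(Si_mu, Si_iota)]  # eq. (2.2)

cov_ip = [dot(row, w_p) for row in Sigma]  # sigma_ip for each asset i
var_p = dot(w_p, cov_ip)                   # sigma_p^2
beta = [c / var_p for c in cov_ip]
mu_z = -gam / lam                          # from lam * mu_z + gam = 0

implied = [mu_z + b * (mu_p - mu_z) for b in beta]
for i, (m_i, m_hat) in enumerate(zip(mu, implied)):
    print(f"asset {i}: mu = {m_i:.6f}, mu_z + beta*(mu_p - mu_z) = {m_hat:.6f}")
```

Changing `mu_p` changes the betas and $\mu_z$, but the fit stays exact, which is the sense in which the relation carries no economic content until p is tied to a specific portfolio.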
2.2.3 Multiperiod Portfolio Choice

In moving to a multiperiod setting the agent now considers future expected consumption. Time subscripts are for indexing only. Other subscripts denote partial derivatives.

$$\max_{\{C,w\}} \sum_{t=1}^{T} E_t[U(C_t)] + E_t[B(W_T)]$$

Define the indirect utility function as

$$J(W_t) \equiv \max_{\{C,w\}} \sum_{s=t}^{T} E_t[U(C_s)] + E_t[B(W_T)]$$

with $J(W_T) = B(W_T)$. At T − 1, indirect utility is

$$J(W_{T-1}) = \max\; U(C_{T-1}) + E_{T-1}[J(W_T)]$$

where $W_T = [W_{T-1} - C_{T-1}]\left[\sum_i w_i(R_i - R_f) + R_f\right]$. The first order conditions are

$$U_C - E_{T-1}[B_W R^*] = 0 \quad\text{and}\quad E_{T-1}[B_W(R_i - R_f)] = 0,$$

where $R^* = \sum_i w_i(R_i - R_f) + R_f$ is the gross return on the chosen portfolio. This generalizes to

$$J(W_\tau) = \max\; U(C_\tau) + E_\tau[J(W_{\tau+1})]$$

with FOCs

$$U_C = E_\tau[J_W R^*] \quad\text{and}\quad E_\tau[J_W(R_i - R_f)] = 0.$$

With log utility optimal consumption depends only on current wealth and not on the investment opportunity set. Consumption is a specific proportion of wealth and investors choose portfolios as in a single period setting by equating the marginal utilities across assets. With power utility, optimal consumption does depend on the investment opportunity set although investment decisions are independent of consumption. With more general HARA utility both consumption and portfolio choice depend on wealth.
2.3 Equilibrium Asset Pricing Theory

The equilibrium approach begins with agents' preferences and maximizes expected utility subject to budget constraints and market clearing conditions. This approach has the advantages of internal consistency (no arbitrage opportunities) and of providing comparative statics. Models may be general equilibrium or partial (e.g., take the riskless rate as given). A disadvantage of these models is that they require taking a stand on preferences, and this often involves a tradeoff between reality and tractability. The standard CAPM is set in a single-period discrete world, whereas the ICAPM and CCAPM are multi-period models in continuous time.
2.3.1 Utility Functions
Utility functions are the foundation of equilibrium asset pricing models. Specifying a utility function determines the features of the agents' preferences, which in turn affect how assets are priced in the economy. Here we will discuss several important classes of utility functions in a nested framework. Many of the commonly used functions are special cases of more general specifications. This section also briefly discusses aggregation, representative agents, and the implications for asset pricing.

One desirable feature is time-separability of a utility function. This means that an agent's consumption today does not affect his consumption preferences in the future (no hangovers):

$$u(c_0, \ldots, c_T) = E_t\left[\sum_{\tau=0}^{T} \beta^\tau u_\tau(c_\tau)\right]$$

This is a strong assumption, but it greatly simplifies much of the analysis. Durability is one source of nonseparability. Models that relax the separability assumption include habit persistence, “keeping up with the Joneses,” and the Epstein-Zin class of recursive preferences models.

Risk-averse agents have utility functions that are concave in wealth (or consumption). In this case, $u[E(c)] \ge E[u(c)]$ (by Jensen's Inequality). It is expected utility we care about. Concave utility functions mean the agent is better off with a certain outcome than a risky outcome.

There are several measures of risk aversion. $R_A = -u''/u'$ is the Arrow-Pratt measure of absolute risk aversion, which applies to small risks. The larger is $R_A$, the larger the risk premium required to induce the agent to invest in risky assets. CARA means the agent keeps a fixed dollar amount invested in the risky asset as wealth changes. Models based on CARA preferences do not have income effects. IARA implies the risky asset is an inferior good: those with more wealth take less risk. This doesn't make sense if one thinks about a subsistence level of wealth.

To measure relative risk aversion, we use $R_R(C) = R_A(C)\,C$. This measure describes proportional changes in the risky asset investment for changes in wealth. The wealth elasticity of demand is unity for CRRA utility functions and greater than one for DRRA functions. With CRRA, agents invest a constant proportion of their wealth in the risky assets, whereas with DRRA the fraction of wealth invested in the risky asset increases with initial wealth.
Table 2.1: Common Utility Functions: HARA

$$u(W) = \frac{1-\gamma}{\gamma}\left(\frac{\alpha W}{1-\gamma} + b\right)^{\gamma}$$

Case            u =               γ       b     Features
Risk-neutral                      1
Quadratic                         2             IARA, M-V
Negative Exp.   $-e^{-\alpha W}$  −∞      1     CARA = α
Power           $W^\gamma/\gamma$ < 1     0     CRRA = 1 − γ
Log             log(W)            0       0     CRRA = 1
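The risk aversion measures for the negative exponential, power, and log cases can be checked numerically. The sketch below approximates $R_A = -u''/u'$ with central finite differences (a swapped-in numerical shortcut, not a method from the notes) and uses assumed parameter values α = 2 and γ = −1. It also illustrates Jensen's inequality for log utility with a made-up 50/50 gamble.

```python
import math


def abs_risk_aversion(u, w, step=1e-4):
    """Arrow-Pratt absolute risk aversion -u''/u' via central differences."""
    u1 = (u(w + step) - u(w - step)) / (2 * step)
    u2 = (u(w + step) - 2 * u(w) + u(w - step)) / step ** 2
    return -u2 / u1


def rel_risk_aversion(u, w, step=1e-4):
    """Relative risk aversion R_R = R_A * W."""
    return w * abs_risk_aversion(u, w, step)


ALPHA, GAMMA = 2.0, -1.0   # illustrative parameter values (assumed)


def neg_exp(w):
    return -math.exp(-ALPHA * w)


def power(w):
    return w ** GAMMA / GAMMA


def log_u(w):
    return math.log(w)


for w in (1.0, 2.0, 4.0):
    print(f"W={w}: CARA {abs_risk_aversion(neg_exp, w):.5f}, "
          f"power CRRA {rel_risk_aversion(power, w):.5f}, "
          f"log CRRA {rel_risk_aversion(log_u, w):.5f}")

# Jensen's inequality with a 50/50 gamble over consumption of 1 or 3.
outcomes = [1.0, 3.0]
u_of_mean = log_u(sum(outcomes) / 2)
mean_of_u = sum(log_u(c) for c in outcomes) / 2
print(f"u(E[c]) = {u_of_mean:.4f} >= E[u(c)] = {mean_of_u:.4f}")
```

The absolute measure for the negative exponential and the relative measures for power and log utility do not vary with wealth, which is what the CARA and CRRA labels assert.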
The HARA (hyperbolic absolute risk aversion) family nests many commonly used classes of utility functions. Table 2.1 summarizes the features of common utility functions. With a riskless asset, quadratic or HARA utility implies two-fund separation. If there is not a riskless asset, quadratic or CRRA utility provides this result. With the exception of quadratic (which has its own undesirable properties), restrictions on utility functions alone do not imply mean-variance preferences, and therefore do not imply the CAPM.

Equilibrium models rely on the ability to aggregate over individuals in the economy. A complete or effectively complete market guarantees the existence of a representative agent. The representative agent's utility function is completely determined by individual agents' preferences and wealths and is independent of available assets only when all investors have HARA utility. The risk aversion of the representative agent is the harmonic mean of individual risk aversions, and will be less than or equal to the wealth-weighted average. It is easier to establish the existence of a representative agent than it is to aggregate demands. In many cases, however, we are interested in the less difficult task of aggregating demand only at the equilibrium price.
2.3.2 CAPM Theory

Assumptions
• homogeneous expectations (distinguishes from portfolio theory)
• quadratic utility or multivariate normality of returns
• rational, risk-averse investors
• perfect capital markets
• unrestricted short selling (Black)
• borrow and lend at riskless rate (SL)
Derivation of Sharpe-Lintner Model

$$L = \tfrac{1}{2} w'\Sigma w + \lambda[\mu_p - \mu_f - w'(\mu - \mu_f\iota)]$$

FOCs:

$$\Sigma w = \lambda(\mu - \mu_f\iota), \qquad \mu_p - \mu_f = w'(\mu - \mu_f\iota)$$

Solving for λ, $\lambda = w'\Sigma w\,[w'(\mu - \mu_f\iota)]^{-1}$, so

$$\mu - \mu_f\iota = \Sigma w/\lambda = \frac{\Sigma w}{w'\Sigma w}(\mu_p - \mu_f) = \beta(\mu_p - \mu_f)$$

Investors will only hold a combination of the riskfree asset and a tangency portfolio. With homogeneous expectations the portfolio p must be the value-weighted market portfolio M.

$$\mu - \mu_f\iota = \beta(\mu_M - \mu_f)$$

Derivation of Black Model

Black's (1972) CAPM adds one assumption to give the portfolio math results economic content. With investor homogeneity, all investors will hold efficient portfolios. Since the value-weighted market portfolio is a linear combination of these efficient portfolios, it too is efficient. We can then rewrite (2.4) as

$$\mu_i = \mu_z + \beta_i(\mu_M - \mu_z).$$

Alternatively, we can maximize expected return for a given portfolio variance.

$$L = w'\mu + \mu_z(1 - \iota'w) + \lambda(\sigma^2 - w'\Sigma w)$$

gives FOCs:

$$\sigma^2 = w'\Sigma w, \qquad 1 = \iota'w, \qquad \mu = \mu_z\iota + 2\lambda\Sigma w.$$

So

$$w'\mu = \mu_z + 2\lambda\sigma^2$$

For the market portfolio $2\lambda = (\mu_M - \mu_z)/\sigma_M^2$. For a generic asset,

$$\mu_i = \mu_z + (\mu_M - \mu_z)\sigma_{iM}/\sigma_M^2 = \mu_z + \beta_i(\mu_M - \mu_z).$$
Interpretation

The assets that covary negatively with the market tend to pay off when the market is doing poorly. These assets are valuable to investors in smoothing their wealth. Since they are valuable, investors will pay a high price and accept a low return. Thus, assets with low or negative betas will have low (or possibly negative) expected returns. Higher risk aversion increases the risk-return tradeoff. This is measured by the Sharpe ratio $(E[r_M] - r_f)/\sigma_M$, the slope of the CML.
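The Sharpe-Lintner result can be illustrated with two risky assets and a riskless rate. This is a minimal sketch with assumed numbers: it forms the tangency portfolio as the normalized solution of the first FOC, $w \propto \Sigma^{-1}(\mu - \mu_f\iota)$, computes $\beta = \Sigma w/(w'\Sigma w)$, and checks that excess returns line up with beta times the portfolio's excess return.

```python
def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Illustrative inputs (assumed, not from the notes).
mu = [0.10, 0.16]          # expected returns on the risky assets
mu_f = 0.04                # riskless rate
Sigma = [[0.040, 0.012],
         [0.012, 0.090]]

excess = [m - mu_f for m in mu]
raw = matvec(inv2(Sigma), excess)       # proportional to tangency weights
w = [x / sum(raw) for x in raw]         # normalize so weights sum to one

Sigma_w = matvec(Sigma, w)              # covariance of each asset with p
var_p = dot(w, Sigma_w)
mu_p = dot(w, mu)
beta = [s / var_p for s in Sigma_w]
sharpe_p = (mu_p - mu_f) / var_p ** 0.5  # slope of the CML if p is the market

for i in range(2):
    print(f"asset {i}: excess return {excess[i]:.4f}, "
          f"beta * (mu_p - mu_f) = {beta[i] * (mu_p - mu_f):.4f}")
print(f"tangency weights {[round(x, 4) for x in w]}, Sharpe ratio {sharpe_p:.4f}")
```

The tangency portfolio's Sharpe ratio is at least as large as that of either asset held alone, which is why every investor combines it with the riskless asset.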
2.3.3 ICAPM Theory
The intertemporal capital asset pricing model and consumption capital asset pricing model extend the standard CAPM intuition to a multi-period setting. The ICAPM replaces dependence on quadratic utility/normal returns with the assumption of a GBM process which implies normally distributed returns. In the continuous time setting, higher moments do not matter, improving tractability of the model. An advantage over the CAPM is utility can be state-dependent, although the time-separability assumption remains. With constant risk tolerance utility functions and constant investment opportunities, optimal portfolio choices are also constant. When the investment opportunity set changes, so will portfolio allocations.
Merton's (1973) ICAPM begins with the specification of asset price paths. Demands are determined by investors maximizing current and expected future utility, subject to their budget constraints. Preferences are instantaneously state-independent and depend only on immediate consumption. The indirect utility function, which is the maximized utility of future wealth, is state-dependent. A collection of state variables are sufficient statistics for summarizing the investment opportunities. Investors hedge against adverse changes in the investment opportunity set, with the end goal being a hedge against changes in consumption.

Assumptions
• limited liability
• perfect markets
• no restrictions on trading volume/short selling
• always in equilibrium
• borrow/lend at same rate
• continuous-time trading
• state variable has continuous sample path
• first 2 return moments exist, higher moments unimportant
• returns have a compact distribution
• time-separable preferences
• $\tilde{r}_i = \alpha_i\,dt + \sigma_i\,dz_i$

Under certain conditions, we have two-fund separation and the CAPM:
1. log utility (this means $J_{Wx} = 0$, investors do not want to hedge)
2. $\sigma_{ix} = 0\ \forall\, i$ (no hedge is possible)

The following derivation is for a single state variable x. The more general case of a vector of state variables is similar.

Underlying Processes

$$dW = -C\,dt + [W - C\,dt]\,w'r \qquad\qquad dx = \mu\,dt + s\,\tilde{\varepsilon}_x\sqrt{dt}$$
$$E_t[dW] = [W w'\alpha - C]\,dt \qquad\qquad E(dx) = \mu\,dt$$
$$\mathrm{var}(dW) = W^2 w'\Sigma w\,dt \qquad\qquad \mathrm{var}(dx) = s^2\,dt$$
$$\mathrm{cov}(dx, r) = \rho_{ix}\, s\,\sigma_i\,dt = \sigma_{ix}\,dt$$
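Optimization Problem

$$J(W, x, t) = \max E_t\left[\int_t^{t+dt} U(C, s)\,ds + J(W + dW, x + dx, t + dt)\right]$$

$$J(W + dW, x + dx, t + dt) = J(W, x, t) + J_t\,dt + J_W\,dW + J_x\,dx + \tfrac{1}{2}J_{WW}(dW)^2 + \tfrac{1}{2}J_{xx}(dx)^2 + \tfrac{1}{2}J_{tt}(dt)^2 + J_{Wx}\,dW\,dx + J_{tx}\,dt\,dx + J_{tW}\,dt\,dW + \phi \tag{2.5}$$

where φ contains higher-order terms.

$$E[J(\cdot,\cdot,\cdot)] = J + J_t\,dt + J_W E[dW] + J_x E[dx] + \tfrac{1}{2}J_{WW}\,\mathrm{var}(dW) + \tfrac{1}{2}J_{xx}\,\mathrm{var}(dx) + J_{Wx}\,\mathrm{cov}(dW, dx) \tag{2.6}$$

$$0 = \max_{\{C,w\}}\left[U(C, t) + J_t + J_W(-C + W w'\alpha) + \frac{W^2}{2}J_{WW}\,w'\Sigma w + J_x\mu + \tfrac{1}{2}J_{xx}s^2 + J_{Wx}W w'\boldsymbol{\sigma}_{ix}\right]dt \tag{2.7}$$

FOCs (with portfolio constraint $\sum_{i=0}^{N} w_i\alpha_i = r_f + \sum_{i=1}^{N} w_i(\alpha_i - r_f)$):

$$U_C = J_W \quad\text{(envelope condition)}$$
$$W J_W(\alpha - r_f\iota) + W^2 J_{WW}\Sigma w + W J_{Wx}\boldsymbol{\sigma}_{ix} = 0$$

Now solve for optimal portfolio weights

$$w^* = \left(\frac{-J_W}{W J_{WW}}\right)\Sigma^{-1}(\alpha - r_f\iota) + \left(\frac{-J_{Wx}}{W J_{WW}}\right)\Sigma^{-1}\boldsymbol{\sigma}_{ix}$$

Define

$$D \equiv \left(\frac{-J_W}{W J_{WW}}\right)\iota'\Sigma^{-1}(\alpha - r_f\iota) \qquad H \equiv \left(\frac{-J_{Wx}}{W J_{WW}}\right)\iota'\Sigma^{-1}\boldsymbol{\sigma}_{ix} \tag{2.8}$$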
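$$t = \frac{\Sigma^{-1}(\alpha - r_f\iota)}{\iota'\Sigma^{-1}(\alpha - r_f\iota)} \qquad h = \frac{\Sigma^{-1}\boldsymbol{\sigma}_{ix}}{\iota'\Sigma^{-1}\boldsymbol{\sigma}_{ix}}$$

Therefore $w^* = Dt + Hh$. Further, $\iota't = \iota'h = 1$ so t and h are portfolios. This gives three-fund separation, with the third fund being the riskless asset. h is the “hedge portfolio,” and has the highest correlation with the state variable x. This setup generalizes with a vector of state variables, in which case we have dim(x) + 2-fund separation.

Equilibrium conditions: Define $a_k = -J_W/J_{WW}$ and $b_k = -J_{Wx}/J_{WW}$ where k indexes the investor. Rewrite the second FOC as:

$$J_W(\alpha - r_f\iota) + J_{WW}W\Sigma w^* + J_{Wx}\boldsymbol{\sigma}_{ix} = 0$$
$$a_k(\alpha - r_f\iota) = W_k\Sigma w_k - b_k\boldsymbol{\sigma}_{ix}$$

Sum over all investors and divide by $\sum_k a_k$:

$$(\alpha - r_f\iota) = A\Sigma\mu - B\boldsymbol{\sigma}_{ix} \quad\text{or}\quad (\alpha_i - r_f) = A\sigma_{im} - B\sigma_{ix}$$

where $A = \sum_k W_k/\sum_k a_k$, $B = \sum_k b_k/\sum_k a_k$, and $\mu = \sum_k w_k W_k/\sum_k W_k$ (average investment in each asset across investors). Now multiply by µ' and h' to get

$$\alpha_m - r = A\sigma_m^2 - B\sigma_{mx},$$
$$\alpha_h - r = A\sigma_{hm} - B\sigma_{hx}.$$

Solving for A and B and substituting,

$$\alpha_i - r = \frac{\sigma_{im}\sigma_{hx} - \sigma_{ix}\sigma_{mh}}{\sigma_m^2\sigma_{hx} - \sigma_{mx}\sigma_{mh}}(\alpha_m - r) + \frac{\sigma_{ix}\sigma_m^2 - \sigma_{im}\sigma_{mx}}{\sigma_m^2\sigma_{hx} - \sigma_{mx}\sigma_{mh}}(\alpha_h - r)$$
$$= \beta_{im}(x)(\alpha_m - r) + \beta_{ih}(x)(\alpha_h - r)$$

The βs have the interpretation of regression coefficients in an IV regression, where x serves as an instrument for h. Note that

$$\boldsymbol{\sigma}_{ih} = \Sigma h = \frac{\Sigma\Sigma^{-1}\boldsymbol{\sigma}_{ix}}{\iota'\Sigma^{-1}\boldsymbol{\sigma}_{ix}} = \frac{\boldsymbol{\sigma}_{ix}}{\iota'\Sigma^{-1}\boldsymbol{\sigma}_{ix}}$$

Therefore, $\sigma_{ix} = k\sigma_{ih}$. This trick generalizes to cov(j, x) = k cov(j, h) where $k = \iota'\Sigma^{-1}\boldsymbol{\sigma}_{ix}$. Terms depending on x can be factored from the betas so $\beta_{im}(x) = \beta_{im}$ and $\beta_{ih}(x) = \beta_{ih}$.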
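The two-beta pricing relation and the proportionality between covariances with x and with h can be checked numerically. The sketch below is built on assumptions: the covariance matrix, the covariances with the state variable, the market weights, the riskless rate, and the aggregate constants A and B are all made-up values, and expected returns are generated from the equilibrium condition $(\alpha - r\iota) = A\Sigma\mu - B\sigma_{ix}$ rather than estimated.

```python
def solve(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(map(float, M[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * c for a, c in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


# Illustrative inputs (assumed, not from the notes).
Sigma = [[0.040, 0.006, 0.010],
         [0.006, 0.090, 0.020],
         [0.010, 0.020, 0.160]]
sigma_x = [0.010, -0.015, 0.020]   # covariance of each asset with dx
mkt = [0.5, 0.3, 0.2]              # market weights (the vector mu in the text)
r = 0.03                           # riskless rate
A_agg, B_agg = 2.5, 1.2            # aggregate constants A and B

# Hedge portfolio h = Sigma^{-1} sigma_x / (iota' Sigma^{-1} sigma_x).
Si_sx = solve(Sigma, sigma_x)
k = sum(Si_sx)
h = [v / k for v in Si_sx]

sig_im = matvec(Sigma, mkt)        # covariance of each asset with the market
sig_ih = matvec(Sigma, h)          # covariance of each asset with h
alpha = [r + A_agg * sm - B_agg * sx for sm, sx in zip(sig_im, sigma_x)]

var_m = dot(mkt, sig_im)
sig_mh = dot(mkt, sig_ih)
sig_mx = dot(mkt, sigma_x)
sig_hx = dot(h, sigma_x)
alpha_m, alpha_h = dot(mkt, alpha), dot(h, alpha)

delta = var_m * sig_hx - sig_mx * sig_mh
beta_m = [(sm * sig_hx - sx * sig_mh) / delta for sm, sx in zip(sig_im, sigma_x)]
beta_h = [(sx * var_m - sm * sig_mx) / delta for sm, sx in zip(sig_im, sigma_x)]

fitted = [r + bm * (alpha_m - r) + bh * (alpha_h - r)
          for bm, bh in zip(beta_m, beta_h)]
for i in range(3):
    print(f"asset {i}: alpha {alpha[i]:.6f}, two-beta fit {fitted[i]:.6f}, "
          f"sigma_ix {sigma_x[i]:.6f}, k*sigma_ih {k * sig_ih[i]:.6f}")
```

Only the covariances with the market and with the state variable enter the betas, so the unobserved constants A and B drop out once the market and hedge portfolio premiums are known.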
2.3.4 CCAPM Theory
The CCAPM, due to Breeden (1979), is very much like the ICAPM with consumption growth as the single state variable. In the ICAPM investors hedge against changes in the state variables because these represent changes in the investment opportunity set, and therefore, changes in consumption. The CCAPM goes directly to hedging against changes in consumption. The model is also similar to the static CAPM, where end of period wealth mattered. Since the CAPM is one period, end of period wealth is the same as consumption. A key assumption in the CCAPM is additively separable preferences, which gives state independence of direct utility.

To make more clear the link between the ICAPM and the CCAPM, note that in the ICAPM agents set the marginal utility of wealth equal to the marginal utility of consumption along the optimal consumption path. This is the envelope condition, $U_C = J_W$. If markets are complete, then perfect hedges for the state variables can be formed and all individuals will have perfectly (instantaneously) correlated consumption policies. This is an analogue to all individuals holding the market portfolio in the static CAPM.

In many ways, the CCAPM is the most fundamental of the equilibrium models. It is illogical to choose the CAPM or ICAPM because you think the consumption-based model is wrong. The only reason for choosing an alternative model is that the consumption data to test the model may be unsatisfactory.

CCAPM Derivation

The combination of portfolios h and t which the investor chooses minimizes the variance in consumption, not wealth. The CCAPM can be derived as a simple modification to the previous derivation of the ICAPM. Since $U_C = J_W$ at the optimum, $J_{WW} = U_{CC}C^*_W$ and $J_{Wx} = U_{CC}C^*_x$. Substituting into (2.8),

$$w^* = \left(\frac{-U_C}{W U_{CC}C^*_W}\right)\Sigma^{-1}(\alpha - r_f\iota) + \left(\frac{-U_{CC}C^*_x}{W U_{CC}C^*_W}\right)\Sigma^{-1}\boldsymbol{\sigma}_{ix}$$

or

$$\frac{-U_C}{U_{CC}}(\alpha - r_f\iota) = W C^*_W\Sigma w^* + C^*_x\boldsymbol{\sigma}_{ix}.$$
CHAPTER 2. ASSET PRICING
The covariance between the return on asset i and consumption growth is
$$ \mathrm{cov}\left(r, \frac{dC}{C}\right) = E[(\alpha\,dt + \Sigma\,dz)(C_t\,dt + C_W\,dW + C_x\,dx + \phi)] = \frac{W C_W}{C}\,\Sigma w + \frac{C_x}{C}\,\sigma_{ix} = \sigma^k_{iC}/C. $$
Noting that this is different for each agent k and letting $T^k = -U_C/U_{CC}$,
$$ T^k(\alpha_i - r_f) = \sigma^k_{iC}. $$
Summing over all investors we get $(\alpha_i - r_f) = T^{-1}\sigma_{iC}$. Defining a reference portfolio C, $\sigma_C^2 = w_C'\sigma_{iC} = T(\alpha_C - r_f)$. Solving for T and substituting,
$$ (\alpha_i - r_f) = \frac{\sigma_{iC}}{\sigma_{pC}}(\alpha_p - r_f) = \beta_{iC}(\alpha_C - r_f). $$
Note that if the consumption portfolio is not itself a traded asset then the portfolio with the maximum correlation with consumption can be used. The same basic intuition applies, but this results in the same kind of instrumental variable flavor as in the previous presentation of the ICAPM. If consumption is available, it serves as the single variable driving the returns process. When it is not available we include additional state variables to use as instruments.
2.3.5
The CIR Model
Cox, Ingersoll, and Ross (CIR) derive a general equilibrium model with endogenous production and stochastic technology shocks. The distribution of production depends on the state variables Y, which change randomly. This model fills a void in the literature in that it endogenously determines the equilibrium price path, given the specification of technology. Recall the ICAPM begins with a specification of the price path and then determines the equilibrium demand.
Assumptions
• single physical good
• n production activities follow (2.9)
• k state variables follow (2.10)
• contingent claims for the single good, whose value follows (2.11)
• competitive markets
• endogenously determined instantaneous borrowing/lending rate r
• fixed number of identical individuals who maximize $E\int_t^{t'} U[C(s), Y(s), s]\,ds$
• continuous investing and trading with no transactions costs
• there exists a unique J and $\hat{v}$
• (technical) $v \in V$, the class of admissible controls
• (technical) J, $a^*$ and $C^*$ are sufficiently differentiable.

Underlying Processes
n Production Activities
$$ d\eta(t) = I_\eta\,\alpha(Y, t)\,dt + I_\eta\,G(Y, t)\,dw(t) \qquad (2.9) $$
k State Variables
$$ dY(t) = \mu(Y, t)\,dt + S(Y, t)\,dw(t) \qquad (2.10) $$
Value of Contingent Claim i
$$ dF^i = (F^i\beta_i - \delta_i)\,dt + F^i h_i\,dw(t) \qquad (2.11) $$
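Derivation
Budget constraint
$$ dW = \left[\sum_{i=1}^{n} a_i W(\alpha_i - r) + \sum_{i=1}^{k} b_i W(\beta_i - r) + rW - C\right]dt + \sum_{i=1}^{n} a_i W\left(\sum_{j=1}^{n+k} g_{ij}\,dw_j\right) + \sum_{i=1}^{k} b_i W\left(\sum_{j=1}^{n+k} h_{ij}\,dw_j\right) $$
or
$$ dW = W\mu(W)\,dt + W\sum_{j=1}^{n+k} q_j\,dw_j \qquad (2.12) $$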
Let $K(v(t), W(t), Y(t), t) \equiv E_{W,Y,t}\left[\int_t^{t'} U(v(s), Y(s), s)\,ds\right]$ and define $L^v(t)$ as the differential operator
$$ L^v(t)K = \mu(W)\,W K_W + \sum_{i=1}^{k}\mu_i K_{Y_i} + \frac{1}{2}W^2 K_{WW}\sum_{i=1}^{n+k} q_i^2 + \sum_{i=1}^{k} W K_{WY_i}\sum_{j=1}^{n+k} q_j s_{ij} + \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{k} K_{Y_iY_j}\sum_{m=1}^{n+k} s_{im}s_{jm} \qquad (2.13) $$
Let the indirect utility function J(W, Y, t) be the solution to
$$ \max_{v \in V}\,[L^v(t)J + U(v, Y, t)] + J_t = 0. $$
J has many of the same properties as U, such as being increasing and strictly concave in W. Defining $\Psi = L^vJ + U$, we get the following necessary and sufficient conditions:
• $\Psi_C = U_C - J_W \le 0$
• $C\Psi_C = 0$
• $\Psi_a = [\alpha - r]WJ_W + [GG'a + GH'b]W^2J_{WW} + GS'WJ_{WY} \le 0$
• $a'\Psi_a = 0$
• $\Psi_b = [\beta - r]WJ_W + [HG'a + HH'b]W^2J_{WW} + HS'WJ_{WY} = 0$
Solving for $\hat{C}$, $\hat{a}$, $\hat{b}$, we obtain a PDE for J. The equilibrium satisfies these conditions and markets clear: $\sum a_i = 1$ and $b_i = 0\ \forall\ i$.

Characterizations
The expected rate of return on wealth is $a^{*\prime}\alpha$. r is the negative of the expected rate of change in the MU of wealth, or $a^{*\prime}\alpha$ plus the covariance between the rate of return on wealth and the rate of change in the MU of wealth.
$$ r = -E\left[\frac{J_{WW}}{J_W}\right] = E\left[\frac{dW}{W}\right] + \mathrm{cov}\left(\frac{dW}{W}, -\frac{J_{WW}}{J_W}\right) $$
The expected rate of return on the ith contingent claim is
$$ (\beta_i - r)F^i = [\phi_W\ \ \phi_Y'][F^i_W\ \ F^i_Y]' $$
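where
$$ \phi_W = \left[\frac{-J_{WW}}{J_W}\right]\mathrm{var}(W) + \sum_{i=1}^{k}\left[\frac{-J_{WY_i}}{J_W}\right]\mathrm{cov}(W, Y_i) = (a^{*\prime}\alpha - r)W $$
and
$$ \phi_{Y_i} = \left[\frac{-J_{WW}}{J_W}\right]\mathrm{cov}(W, Y_i) + \sum_{j=1}^{k}\left[\frac{-J_{WY_j}}{J_W}\right]\mathrm{cov}(Y_i, Y_j) $$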
Alternatively, we can write $\beta_i = r - \mathrm{cov}(F^i, J_W)/(F^iJ_W)$. The expected return on a contingent claim is the riskfree rate plus a linear combination of the first partials of the asset price with respect to W and Y. The weights are the φ coefficients, which are much like factor risk premiums in the APT or hedge portfolios in the ICAPM. The φs do not depend on the contingent claim itself and are the same for all claims. If U is not state-dependent, we get a CCAPM-type result, with $\phi_W = -\frac{u''}{u'}\mathrm{cov}(C^*, W)$ and $\phi_Y = -\frac{u''}{u'}\mathrm{cov}(C^*, Y)$, giving $(\beta_i - r)F^i = -\frac{u''}{u'}\mathrm{cov}(C^*, F^i)$. The expected excess return on an asset is proportional to its covariance with optimal consumption. We can then express relative rates of return in a way that does not depend (explicitly) on preferences.

Fundamental Valuation Equation
$$ \frac{1}{2}\mathrm{var}(W)F_{WW} + \sum_i \mathrm{cov}(W, Y_i)F_{WY_i} + \frac{1}{2}\sum_i\sum_j \mathrm{cov}(Y_i, Y_j)F_{Y_iY_j} + \sum_i F_{Y_i}\left[\mu_i - \left(\frac{-J_{WW}}{J_W}\right)\mathrm{cov}(W, Y_i) - \sum_j\left(\frac{-J_{WY_j}}{J_W}\right)\mathrm{cov}(Y_i, Y_j)\right] + [rW - C^*]F_W + F_t - rF + \delta(W, Y, t) = 0 \qquad (2.14) $$
where r and C ∗ are functions of W, Y, and t. This PDE holds for any contingent claim, with boundary conditions and δ depending on the terms of the claim. The PDE can price assets with payoffs (i) contingent on crossing a barrier, (ii) contingent on not crossing a barrier, and/or (iii) flow payoffs.
We can focus on the system of equations:
$$ dW(t) = [a^{*\prime}\alpha W - C^*]\,dt + a^{*\prime}GW\,dw(t) $$
$$ dY(t) = \mu(Y, t)\,dt + S(Y, t)\,dw(t) $$
or a second system with a different drift term reflecting a change of measure:
$$ dW(t) = [a^{*\prime}\alpha W - C^* - \phi_W]\,dt + a^{*\prime}GW\,dw(t) $$
$$ dY(t) = [\mu(Y, t) - \phi_Y']\,dt + S(Y, t)\,dw(t) $$
The expression $\frac{J_W(W(s), Y(s), s)}{J_W(W(t), Y(t), t)}$ is the conditional pricing kernel.
2.4
Arbitrage Asset Pricing
Arbitrage pricing takes a set of basis assets as given and uses them to price other assets.
2.4.1
State Contingent Claims
State contingent claims, or Arrow-Debreu securities, are the building blocks for all assets. These securities pay one dollar in a specified state and zero otherwise. Ross (1977b) shows the absence of arbitrage implies the existence of state contingent prices and, therefore, of a linear pricing operator. This is really just a spanning result. We can write $p(x) = \sum_s \phi(s)x(s)$. This says the price of security x is the sum over all states of the price of a dollar in each state, φ(s), scaled by the size of the payoff in each state, x(s). Harrison and Kreps (1979) extend this to show that this operator can be represented as an expectation with respect to a martingale measure.

Let D denote an (n × n) matrix of asset payoffs with typical element $d_{ij}$, where i denotes the state and j the security. This matrix is a collection of vectors $d_j$ of asset payoffs. α is an n-vector of weights, b an n-vector of payoffs. φ is the price vector for the n Arrow-Debreu securities and p the prices of the complex securities. We have the following pricing relations
$$ D'\phi = p \quad \text{and} \quad D\alpha = b $$
with $\iota'\phi = \frac{1}{1+r_f} = p_f$ and $(1 + r_f)\iota'\phi = \iota'\pi = 1$. $\pi = f(\theta, \lambda)$ is the vector of risk-neutral probabilities, a function of the true probabilities θ and risk aversion λ.
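As a numerical illustration of the pricing relations above, the following Python sketch recovers the Arrow-Debreu prices from a square payoff matrix and then prices a new claim. The payoff matrix, the security prices, and the function name `state_prices` are made-up for illustration and are not part of the original notes; the sketch assumes a complete market so that D is invertible.

```python
import numpy as np

def state_prices(D, p):
    """Solve D' phi = p for the Arrow-Debreu price vector phi.

    Assumes a complete market: D is square (states x securities) and nonsingular.
    """
    return np.linalg.solve(D.T, p)

# Hypothetical economy: 2 states (rows) and 2 securities (columns).
D = np.array([[1.0, 2.0],
              [1.0, 0.5]])       # column 0 is riskless, column 1 is risky
p = np.array([0.95, 1.10])       # made-up prices of the two securities

phi = state_prices(D, p)         # state prices
p_f = phi.sum()                  # iota' phi, the price of a sure unit payoff
r_f = 1.0 / p_f - 1.0            # implied riskless rate
pi = (1.0 + r_f) * phi           # risk-neutral probabilities

x = np.array([3.0, 0.0])         # new claim: pays 3 in state 0, nothing in state 1
price_x = phi @ x                # p(x) = sum_s phi(s) x(s)
```

Strictly positive entries in `phi` are what the absence of arbitrage requires in this setting; with other made-up prices the solution could contain a negative state price, signalling an arbitrage opportunity.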
2.4. ARBITRAGE ASSET PRICING
2.4.2
Arbitrage Pricing Theory
The APT, originally developed by Ross (1976), has generated a tremendous literature of theoretical extensions and a wide range of empirical tests. The intuition is simple. Assume returns follow a factor model, meaning returns depend on the realization of factors and (quasi-) orthogonal shocks.¹ The factors are not diversifiable, whereas the orthogonal shocks are, at least in some sense. The theory is silent on what the factors are, or even the number of factors. A key idea is the factor-mimicking portfolio.

There are really three different cases of the APT, depending on the assumptions about the structure of the Ω matrix of “idiosyncratic” covariances. If we have an exact or noiseless factor model, then Ω is the zero matrix and an exact arbitrage argument will hold. Alternatively, we could have a strict factor model in which the matrix is diagonal, so the idiosyncratic shocks are uncorrelated across assets. Large diversified portfolios cause the idiosyncratic variance to go to zero. We appeal to an asymptotic arbitrage argument in which there is no arbitrage on average, although specific securities may be mispriced. Finally, we could allow for a more general correlation structure where Ω may contain non-zero off-diagonal elements. This approximate factor model allows for idiosyncratic correlations (e.g., industries) and requires restrictions on the covariances of returns such that the idiosyncratic part is diversified away while the factors remain. The controversy over the structure of Ω has major implications for the testability of the model.

The APT has a flavor very similar to the ICAPM, although it arises from a different viewpoint. In the end, both models specify expected returns as a function of a linear combination of their covariances with variables (factors and state variables, respectively). This link arises because it is implied by the absence of arbitrage.
The additional assumptions in the equilibrium model serve to determine the risk premium associated with each state variable.² The model has been extended in a number of other ways, including dynamic, conditional, nonlinear, and international versions. Tests of the model have also followed several paths, broadly categorized as cross-sectional or time series. In general, tests reject the model but find it provides more favorable performance than models like the CAPM.

¹ By quasi-orthogonal shocks I mean that some correlation among the residuals is allowed.
² Actually, models such as the CAPM are partial equilibrium models and take the riskless rate and market price of risk as given. Richer models such as CIR introduce production uncertainty and are able to more completely characterize the economy.

APT Derivation
This derivation is based on the strict factor version. The exact APT derivation will also work under this approach. Modifications for the approximate APT are mentioned at the end. It is very important to understand that the APT starts with a characterization of realized returns r, and uses statistical properties to say something about expected returns µ.
$$ r_t = \mu_t + \nu_t = \mu_t + Bf_t + u_t, \qquad E[r_t] = \mu_t, \qquad f_t \sim N(0, I), \qquad u_t \sim N(0, \Omega) \qquad (2.15) $$
where Ω is diagonal. $f_t$ is a factor vector and B a loading matrix, which together give the unexpected factor-related return. Return covariances are $E[\nu_t\nu_t'] = B\,E[f_tf_t']\,B' + \Omega = BB' + \Omega = \Psi$. As an aside, define Φ such that $\Phi\Phi' = I$, giving $B = D\Phi$, a rotation. Therefore, $\Psi = DD' + \Omega$, illustrating the rotational indeterminacy. Next form a portfolio with weights w. The portfolio variance is $\sigma_p^2 = w'\Psi w = w'BB'w + w'\Omega w \approx w'BB'w$. The strategy is to choose w such that $w'BB'w = 0$ without making an investment, $\iota'w = 0$. To find a w think of this as a regression of µ on $[\iota\ \ B]$. This is
$$ \mu = \lambda_0\iota + B\lambda + w. \qquad (2.16) $$
The normal equations from the regression give $\iota'w = 0$ and $B'w = 0$, which implies $w'BB'w = 0$ as desired. To find $w'r$, insert (2.16) into (2.15) to get
$$ r_p = w'(\lambda_0\iota + B\lambda + w) + w'Bf + w'u. $$
2.5. PRICING KERNEL APPROACH
Taking expectations and using the orthogonality conditions, $\mu_p = w'w$. Because this portfolio requires no investment and carries (approximately) no risk, the absence of arbitrage requires $\mu_p \approx 0$, so the pricing errors w must be small. This validates (2.16), which can be written as
$$ \mu_t \approx \lambda_0\iota + B\lambda. \qquad (2.17) $$
If a factor is negatively correlated with the IMRS the model implies a positive risk premium. Using $w^N$ in (2.16), where N indexes the number of assets, a sequence of arbitrage portfolios satisfies the Ross pricing bound if $w^{N\prime}w^N$ does not go to infinity with N. The approximate factor model is derived by requiring that as N → ∞ the smallest eigenvalue of B'B → ∞ while the largest eigenvalue of Ω → 0. That is, the factors are pervasive while the idiosyncratic part is diversifiable.
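The regression argument in the derivation can be checked numerically. The Python sketch below regresses a vector of expected returns on a constant and the loadings, then confirms that the residual vector w is a zero-cost, zero-factor-exposure portfolio whose expected return equals w'w. All inputs (number of assets, loadings, premia, size of the pricing deviations) are simulated, made-up values, not estimates from data.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 50, 2                                   # hypothetical numbers of assets and factors
B = rng.normal(size=(N, K))                    # simulated factor loadings
lam0, lam = 0.02, np.array([0.05, 0.03])       # illustrative zero-beta rate and factor premia
mu = lam0 + B @ lam + rng.normal(scale=0.001, size=N)   # expected returns with small deviations

X = np.column_stack([np.ones(N), B])           # regressors [iota  B]
coef, *_ = np.linalg.lstsq(X, mu, rcond=None)  # estimates of (lambda_0, lambda)
w = mu - X @ coef                              # residuals, used as portfolio weights

cost = w.sum()                 # iota' w: net investment
factor_exposure = B.T @ w      # B' w: exposure to each factor
mu_p = w @ mu                  # expected portfolio return
```

Here `cost` and `factor_exposure` are zero up to rounding by the normal equations, and `mu_p` equals `w @ w`, so small pricing deviations translate into a small expected return on the arbitrage portfolio.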
2.5
Pricing Kernel Approach
The pricing kernel approach is in many ways a hybrid of the equilibrium and arbitrage approaches. The focus is to specify the pricing kernel³ m which makes the Euler equation hold:
$$ p_t = E_t[m_{t+\tau}x_{t+\tau}] \qquad (2.18) $$
This seemingly simple expression is general enough to cover pricing for any asset. The expression can be modified to handle returns, excess returns, stocks, bonds, options, etc. The meanings of the payoff x and the price p change, but the same intuition applies. The expected return on an asset is negatively related to its covariance with the stochastic discount factor. Assets whose returns vary positively with the sdf pay off when the marginal utility is high. That is, they provide wealth in the states when it is most valuable to investors. Consequently, investors are willing to pay high prices and accept low returns for these assets.

There are basically two ways of doing business. One is to take the IMRS as given and interpret (2.18) as the Euler equation arising from the consumer’s optimization problem. The goal would then be to explain asset returns. The other view is to take the returns as given and explore the implications for m. The characteristics of m depend upon the structure of the economy. If the law of one price (LOP) is satisfied, there will exist (at least one) m such that (2.18) holds. In the absence of arbitrage (NA), there exists an m that is strictly positive. If markets are complete then m is unique.

³ This object lives by many names, including the stochastic discount factor (sdf), intertemporal marginal rate of substitution (IMRS), or benchmark pricing variable. It is incorrectly referred to as the Radon-Nikodym derivative, Arrow-Debreu price, or state-contingent claims price (unless the riskless rate is zero). While on naming conventions, the risk-neutral probability measure is also referred to as the equivalent martingale measure (EMM).
2.5.1
Basics
This presentation is for a discrete time, multiperiod model. Define the consumption set $c \in B(e_i, p) \subset R \times X$. The budget constraints are $c(0) = e(0) - \theta'p$ and $c(T, \omega) = e(T, \omega) - \theta'd(\omega)$. Combining these two equations, $\hat{D}\theta = \hat{c} - \hat{e}$. The attainable set $\hat{D}\theta = \hat{c}$ ignores the initial endowment. I will abuse notation and consistency by letting Q and $\pi^*$ refer to the EMM. The latter is more appropriate for discrete settings. Also, dividend (payoff) vectors and matrices are indicated by d and D.

Definition 1 The market is complete iff every consumption process is attainable (M = X), or iff rank(D) = k.

Definition 2 An arbitrage strategy has non-negative, non-zero consumption $\hat{D}\theta \ge_+ 0$ with $e(0) = 0$.

Definition 3 An Equivalent Martingale Measure Q (or $\pi^*$) satisfies $p = D'\pi^*/R_f$. Q exists iff there is no arbitrage, or iff an equilibrium exists. If markets are complete then Q is unique.

Definition 4 A price functional $\Phi : R \times M \to R$ ($\Pi : M \to R$) satisfies $\Phi(c) = c(0) + \Pi(c(T)) = c(0) + \theta'p$ for any θ such that $c(T) = \theta'd$. This implies B(e, p) can be expressed as $\Phi(nc) = 0$ where $nc(t) \equiv c(t) - e(t) \in M$. Π is unique even in an incomplete market and exists if there is an equilibrium. A price system is viable iff there is no arbitrage, iff Q exists, or iff Φ (or Π) exists.

Definition 5 $\Psi : X \to R$ is an extension of Π if for all $x \in M$, $\Psi(x) = \Pi(x)$. A sequence of scaled prices is a Q-martingale.
2.5.2
Different Expectations
Denote the price of asset x, a package of state-contingent claims, as p(x). Then
$$ p(x) = \sum_s \phi(s)x(s) = \sum_s \pi(s)\frac{\phi(s)}{\pi(s)}x(s) = E^P[mx] $$
where π(s) is the (true) probability of state s. It follows then that $m(s) = \phi(s)/\pi(s)$. To move to risk-neutral probabilities $\pi^*$, define $\pi^*(s) \equiv R_f\,m(s)\pi(s) = R_f\,\phi(s)$, where $1/R_f = \sum_s \phi(s) = E[m]$. Then
$$ p(x) = \sum_s \phi(s)x(s) = \sum_s \frac{\pi^*(s)}{R_f}x(s) = \frac{E^Q[x]}{R_f}. $$
These results imply $p(x) = E^Q[x]/R_f = E^P[mx]$. Stated differently,
$$ \frac{\pi^*(s)}{\pi(s)} = R_f\,m(s). $$
The risk neutral probabilities give greater weight to states with high marginal utility, the “bad” states. In discrete time, the “change of measure” is
$$ \frac{\pi^*}{\pi} = R_f\,m = \frac{Q}{P}. $$
In continuous time the analogous expression is
$$ \frac{dQ}{dP} = \lim_{n\to\infty}\frac{f_n^Q(x_1, \ldots, x_n)}{f_n^P(x_1, \ldots, x_n)} $$
where $f_n(\cdot)$ represents the joint likelihood under the respective measure. This expression is the Radon-Nikodym derivative, and is the limit of the likelihood ratios. This random variable satisfies
$$ E^Q(x_T) = E^P\left[\frac{dQ}{dP}x_T\right]. $$
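The three equivalent ways of pricing a payoff in the discrete case can be verified with a small Python sketch. The probabilities, state prices, and payoff below are made-up numbers chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

pi = np.array([0.5, 0.3, 0.2])        # true probabilities (hypothetical)
phi = np.array([0.40, 0.30, 0.25])    # state prices (hypothetical)
x = np.array([1.0, 2.0, 4.0])         # payoff of the asset in each state

m = phi / pi                          # pricing kernel, m(s) = phi(s) / pi(s)
Rf = 1.0 / phi.sum()                  # gross riskless return, 1/Rf = E[m]
pi_star = Rf * phi                    # risk-neutral probabilities

price_state = phi @ x                 # sum_s phi(s) x(s)
price_P = np.sum(pi * m * x)          # E^P[m x]
price_Q = np.sum(pi_star * x) / Rf    # E^Q[x] / Rf

change_of_measure = pi_star / pi      # equals Rf * m state by state
```

In this example the third state has the largest m, so it is the state whose risk-neutral probability is raised the most relative to its true probability.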
2.5.3
Asset Pricing with m
This analysis is useful in pricing assets. For a collection of assets in an economy, the price is the risk-neutral expectation of the future value, discounted back to the present at the riskless rate, $p = D'\pi^*/R_f$. If the market is complete, Q is unique ($\pi^*$ is identifiable in a discrete setting) and we can invert the payoff matrix to solve for the probabilities, $\pi^* = R_f(D')^{-1}p$. If the market is not complete it is often possible to get a range of admissible EMMs. Further restrictions may result from imposing the NA condition that the pricing kernel be positive. Recall that dividing by the riskless rate will give the Arrow-Debreu prices $\phi = \pi^*/R_f = (D')^{-1}p$. Furthermore, the pricing kernel is $m = p_f\,\pi^*/\pi = (D')^{-1}p/\pi$, where the division by π is state by state. Once the EMM or pricing kernel is known it can be used to price any other asset.
2.5.4
The Agent’s Problem
There is a relationship between the pricing kernel and equilibrium approaches. The agent will
$$ \max_{\{c, c(s)\}}\ u(c) + \sum_s \beta\pi(s)u[c(s)] \quad \text{s.t.} \quad c + \sum_s \phi(s)c(s) = y + \sum_s \phi(s)y(s). $$
FOCs are
$$ u'(c) = \lambda, \qquad \beta\pi(s)u'[c(s)] = \lambda\phi(s). $$
Solving,
$$ \phi(s) = \beta\pi(s)\frac{u'[c(s)]}{u'(c)} $$
or
$$ m(s) = \frac{\phi(s)}{\pi(s)} = \beta\frac{u'[c(s)]}{u'(c)}. $$
Thus $m(s_1)/m(s_2) = u'[c(s_1)]/u'[c(s_2)]$, so m gives the marginal rate of substitution between date and state contingent claims. In equilibrium, marginal utility growth should be the same for all consumers
$$ \beta_i\frac{u'(c_{i,t+1})}{u'(c_{i,t})} = \beta_j\frac{u'(c_{j,t+1})}{u'(c_{j,t})}. $$
Hence m is referred to as the IMRS. Taking the expectation of either m or IMRS gives the price of a riskless bond.
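To make the first-order conditions concrete, the Python sketch below evaluates the IMRS under power (CRRA) utility, $u'(c) = c^{-\gamma}$. The choice of power utility, the parameter values, and the consumption outcomes are assumptions made for illustration; the notes themselves leave u general.

```python
import numpy as np

def crra_kernel(c0, c1, beta, gamma):
    """IMRS m(s) = beta * u'(c(s)) / u'(c) with u'(c) = c**(-gamma) (assumed utility)."""
    return beta * (c1 / c0) ** (-gamma)

beta, gamma = 0.98, 2.0                 # illustrative time preference and risk aversion
c0 = 1.0                                # consumption today
c1 = np.array([0.9, 1.0, 1.1])          # consumption in each future state (made up)
pi = np.array([0.25, 0.50, 0.25])       # true state probabilities (made up)

m = crra_kernel(c0, c1, beta, gamma)    # pricing kernel state by state
phi = pi * m                            # state prices from phi(s) = pi(s) m(s)
bond_price = phi.sum()                  # price of a riskless bond, E[m]
```

The kernel is highest in the low-consumption state, which is the sense in which claims paying off in “bad” states are expensive.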
2.5.5
The Main Results
Using the definition of covariance and (2.18)
$$ 1 = E[mR] = E[m]E[R] + \mathrm{cov}(m, R) \qquad (2.19) $$
$$ E[R] = \frac{1}{E[m]} - \frac{\mathrm{cov}(m, R)}{E[m]} \qquad (2.20) $$
It follows immediately that if there is a riskless asset $R_f = 1/E[m]$, or $p_f = E[m]$. Without a riskless asset, we can view 1/E[m] as a “shadow” riskfree rate, or a zero beta return. Note that the expectations have been under the true probability measure P. Using the above results,
$$ E[R_i] = R_f + \frac{\mathrm{cov}(m, R_i)}{\mathrm{var}(m)}\left(-\frac{\mathrm{var}(m)}{E[m]}\right) = R_f + \beta_{i,m}\lambda_m $$
which is a beta pricing model.

Relation between m, β models, and MV frontier
• p = E[mx] ⇒ β: m, x∗, or R∗ can serve as reference variables. If m = b'f, then f, proj(f|X), or proj(f|R) can be used.
• p = E[mx] ⇒ mean-variance frontier which includes R∗
• β ⇒ p = E[mx]: m = b'f
• MV frontier ⇒ p = E[mx]: m = a + bR_mv
• MV frontier ⇒ β model with R_mv as a reference variable.

Table 2.2: Common Pricing Kernels
Model            $m_{t+1}$
CAPM             $a + bR_{W,t+1}$
ICAPM            $a + \sum_{k=1}^{K} b_kf_{k,t+1}$
CCAPM            $\beta\,u'(c_{t+1})/u'(c_t)$
APT              $b'f$
Black-Scholes    $\exp[-(r + \frac{1}{2}\sigma^2)\tau + \sigma\,dZ]$

Since mean-variance efficiency implies a single beta representation, some single beta representation can always be found. The asset pricing model says that a particular portfolio (e.g., the market) will be mean-variance efficient. In other words, the content of a model comes from m = f(·), not p = E[mx]. Also, given any multi-factor or multi-beta representation, we can always find a single beta representation. The relationship between the ICAPM and CCAPM is an example of this.

m as a Portfolio
The portfolio that maximizes squared correlation with m is a minimum variance portfolio. m∗, the projection, also prices assets and can replace m.
$$ p = E[mx] = E[(m^* + \varepsilon)x] = E[m^*x] $$
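The claim that the projection m∗ prices assets as well as m does can be illustrated in a finite-state setting. In the Python sketch below, m∗ is computed as the linear projection of m on the payoffs, $m^* = x'E[xx']^{-1}E[xm]$. The probabilities, kernel values, and payoff matrix are hypothetical, and the market is deliberately incomplete (four states, two payoffs) so that m∗ differs from m.

```python
import numpy as np

pi = np.array([0.2, 0.3, 0.3, 0.2])        # state probabilities (hypothetical)
m = np.array([1.30, 1.05, 0.90, 0.70])     # some valid pricing kernel (hypothetical)
X = np.array([[1.0, 0.5],
              [1.0, 0.9],
              [1.0, 1.2],
              [1.0, 1.6]])                 # 4 states x 2 payoffs: incomplete market

p = X.T @ (pi * m)                 # prices implied by m: E[m x]
Exx = X.T @ (pi[:, None] * X)      # second-moment matrix E[x x']
b = np.linalg.solve(Exx, p)        # projection coefficients
m_star = X @ b                     # m* = proj(m | X), a portfolio of the payoffs
eps = m - m_star                   # projection error

p_star = X.T @ (pi * m_star)       # prices implied by m*: E[m* x]
```

Because `eps` is orthogonal to every payoff, `p_star` reproduces `p` even though `m_star` and `m` are different random variables.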
2.5.6
Hansen-Jagannathan Bounds
The Hansen and Jagannathan (1991) bounds are an important addition to asset pricing. Instead of a binary reject/fail to reject result, the HJ bounds offer some insight as to why a model may be rejected. The bounds are most useful for testing models like the consumption model where m is explicitly specified. They are useless for evaluating factor models that do not specify the factors, since there are always some factor-mimicking portfolios that will work ex post.
2.6. CONDITIONING INFORMATION
Working with excess returns, $E[mr^e] = 0$, so $E[m]E[r_i^e] = -\mathrm{cov}(m, r_i^e) = -\rho_{mr_i}\sigma_m\sigma_{r_i}$. Since |ρ| ≤ 1,
$$ \frac{\sigma_m}{E[m]} \ge \frac{|E[r_i^e]|}{\sigma_{r_i}} \qquad (2.21) $$
This holds for any asset i, including r∗, the return with the maximum Sharpe ratio. To be clear, the maximal Sharpe ratio measures the excess return on the tangency portfolio r∗ relative to its standard deviation (assuming a one-factor world). Both the excess return on the tangency portfolio and the SR depend on $R_f$. Rewriting as $\sigma_m = E[m]\,SR$, the H-J bound is a function of E[m]. As we change E[m], we get a new $R_f$, a new tangency portfolio, and a new Sharpe ratio. Plotting $\sigma_m$ as a function of E[m] gives us the locus of points comprising the H-J bound. Note that if we know $R_f$, then the bound is just a point.

These results are based on the law of one price (LOP), and do not use the no arbitrage (NA) restriction that m > 0. By imposing the NA restriction we can sharpen the bound given in (2.21). The NA bound is very similar to the LOP bound for moderate values of E[m], but as E[m] becomes more extreme (higher SR), the NA bound is much stricter (higher). For payoffs x and Lagrange multipliers λ and δ, $m^+ = [\lambda + \delta'x]^+$ subject to $E[m^+] = E[m]$ and $E[m^+x] = p$. This nonlinear problem can generally be solved numerically. $m^+$ has the interpretation of a call option with zero strike price on a portfolio of payoffs $[1\ \ x']'$.

The H-J bound analysis has been extended in several ways. Snow (1991) generalizes the model to include any moment of m. In this setting the bounds are more sensitive to outliers. Other extensions include incorporating transactions costs, utilizing cross-moments, and analyzing pricing errors as a way to detect specification errors. One example is adding different sets of assets and seeing how much the bound shifts up.
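The LOP version of the bound can be traced out numerically. The Python sketch below uses the standard multi-asset form for gross returns (each with price one), $\sigma_m^2 \ge (\iota - E[m]\mu)'\Sigma^{-1}(\iota - E[m]\mu)$, which is a common way of computing the frontier and is stated here as an assumption rather than taken from the notes. The return moments are made-up, and the function name `hj_bound` is hypothetical.

```python
import numpy as np

def hj_bound(mean_R, cov_R, Em):
    """Lower bound on the std. dev. of m for a given E[m] (LOP bound, gross returns)."""
    gap = np.ones_like(mean_R) - Em * mean_R        # pricing gap: 1 - E[m] E[R]
    return np.sqrt(gap @ np.linalg.solve(cov_R, gap))

mean_R = np.array([1.08, 1.03])                     # made-up mean gross returns
cov_R = np.array([[0.040, 0.002],
                  [0.002, 0.010]])                  # made-up return covariance matrix

grid = np.linspace(0.90, 1.00, 11)                  # candidate values of E[m]
frontier = np.array([hj_bound(mean_R, cov_R, v) for v in grid])

# Single-asset check: with E[m] = 1/Rf the bound equals E[m] times the Sharpe ratio.
Rf = 1.02
one_asset = hj_bound(mean_R[:1], cov_R[:1, :1], 1.0 / Rf)
sharpe = (1.08 - Rf) / np.sqrt(0.04)
```

Plotting `frontier` against `grid` gives the locus described in the text, and evaluating the bound with and without the second asset shows how adding assets can only shift it up.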
2.6
Conditioning Information
The difference between a conditional and unconditional model is the information set used. If payoffs and discount factors (and therefore, prices) are iid, then conditional and unconditional models are the same. Define
UMV iff $E[R_{p^*}^2] \le E[R_p^2]\ \forall\ R_p$ s.t. $E[R_{p^*}] = E[R_p]$
CMV iff $E_t[R_{p^*}^2] \le E_t[R_p^2]\ \forall\ R_p$ s.t. $E_t[R_{p^*}] = E_t[R_p]$
By iterated expectations, this gives UMV ⊆ CMV. If a portfolio is UMV it must be CMV, but the converse need not be true. We can also consider the set of minimum variance portfolios conditional on Z, CMV_Z. Then CMV includes CMV_Z, which in turn includes UMV. A conditional factor pricing model does not imply an unconditional model. An unconditional model does imply a conditional model. From here we can say that it is possible to reject that a portfolio is UMV or CMV_Z, but we cannot reject CMV since the information set for CMV is unobservable. This is similar to the issue raised by Roll (1977); rejecting UMV does not imply rejection of CMV. Cochrane (1998) refers to this as the Hansen and Richard (1987) critique. The use of scaled factors (i.e., scaled by instruments in the proper information set) is a partial solution. If the test is based on 1 = E[mR] for some particular m, then it is possible to test without the complete information set. Recall m∗ can replace m in (2.18), so m∗ is also CMV and is a function of the unobserved information set.

The use of conditional models allows for time-varying expected returns. This time variation can arise due to changes in the risk premium or because of conditional covariances (β changes through time). The ARCH-GARCH family of models is often used to capture the time series behavior of conditional moments.
2.7
Market Efficiency
Examining the link between the theoretical asset pricing models and empirical tests requires a position on market efficiency. The general idea behind market efficiency is that prices reflect available information. Of course, a more precise definition of available information and of the implications of reflecting this information is necessary.

The early view of market efficiency was the random walk. In this model the series of innovations is independent. Empirical evidence during this period found that prices are consistent with a random walk. The apparent implications of this model are that prices are not driven by supply/demand and
2.8. EMPIRICAL ASSET PRICING
there is no point in fundamental analysis. In fact, the random walk does not have these implications since slowly adjusting prices would allow profitable trading strategies. A problem with the random walk is that it simultaneously requires rational investors to eliminate profitable trading opportunities, but also assumes investors irrationally pay for security analysis.

The martingale model was proposed as an alternative to the random walk by Samuelson in the mid-1960s. A random variable $x_{t+1}$ is a martingale with respect to an information set $\Phi_t$ if $E[x_{t+1}|\Phi_t] = x_t$. A fair game has the property that $E[y_{t+1}|\Phi_t] = 0$. Returns are a fair game if prices and dividends follow a martingale. Finding a variable that can predict returns means either that returns are not a martingale or that the variable is not in the information set. More recent versions of market efficiency also assume rational expectations. The martingale will hold when investors have a common, constant rate of time preference, homogeneous beliefs, and are risk-neutral. Note that risk neutrality implies a martingale, but does not imply a random walk. The reason is that a martingale allows dependence of higher moments on the information set, whereas the random walk does not. Allowing for risk aversion does not go very far in reconciling the martingale model with the data.

There are several reasons not to base market efficiency on the martingale model. In a setting such as the ICAPM, conditional expected returns depend on dividends. Since dividends are autocorrelated the conditional expected returns are partially forecastable in violation of the martingale model. Time variation in the risk premium may also lead to failure of the martingale model. Finally, most empirical tests have a joint hypothesis problem. Rejecting a model may mean either the model is wrong or the market is inefficient.
2.8
Empirical Asset Pricing
2.8.1
Properties of Asset Returns
Normality offers nice features in modeling asset prices; however, departures from normality have been extensively documented. Relative to the normal distribution, asset returns exhibit skewness and excess kurtosis. Matters are complicated further by serial correlation in returns.
Table 2.3: Patterns In Returns Factor Size B/M E/P CF/P 1/P T-bills Dividend Yield Term structure slope Expected Inflation Credit quality January Monday Contrarian
Relation – + + + + – + – + + –
Comment Banz (1981) Basu (1977)
also related to volatility
?, Jegadeesh and Titman (1993)
Momentum
Cross-sectional Patterns
There is evidence that lagged variables are useful in predicting stock and bond returns. Many of the results documented in the U.S. are also present in other countries. Table 2.3 provides an overview of these patterns. Interpretation of these patterns is difficult since many of these variables are highly correlated, and much of the relation each has with returns comes in January. At longer time horizons some of the effects, such as size and E/P, tend to reverse themselves. A common criticism is that these variables may be correlated with the true β when estimates of β are noisy. Chan & Chen () show that average size and estimated beta in size-sorted portfolios are almost perfectly negatively correlated. Another issue that arises in interpretation of the cross-sectional regularities is whether they are all capturing the same underlying phenomenon. This is especially likely considering price is in many of the variables. Attempts to disentangle the effects are inconclusive. Some researchers
claim size subsumes E/P, while others claim the opposite. Fama and French (1992) claim that size and B/M together subsume E/P (and beta). Given the way these tests are designed, the B/M variable may actually be a proxy for the true beta. A stock that recently declined in price will have a high B/M. This stock is also likely to be more levered than before its decline, so it is now riskier and should have a higher beta. However, the beta estimate is generally based on returns several years prior, so the recent downturn is likely to be washed out. In the end, the estimated beta may be too low, and the high B/M may capture the added risk of the stock. Alternatively, the B/M results may be due to survivorship biases in the COMPUSTAT tapes.

There are several calendar related patterns in returns. Most famous is the January effect, where returns are much larger in January than in other months. Possible explanations include tax-based trading, window dressing by institutions, and liquidity trading. The January effect is most pronounced for small firms. The weekend effect describes the large negative returns from Friday close to Monday close. It is not clear that all the abnormal return is due to the weekend period, but Monday returns alone do not seem to account for the entire effect. International evidence is mixed with respect to weekly patterns, but many of the Asian markets have a Tuesday effect, which corresponds to Monday trading in the U.S. There is some evidence that most of the returns each month occur during the first two weeks. This may be due to portfolio rebalancing caused by month-end salaries. Finally, there is a holiday effect, where one third of the annual returns occur on the trading days preceding the eight holidays on which the market is closed.⁴

⁴ This is misleading since positive and negative returns cancel out.

In a clever paper Berk (1995) addresses the fact that price is directly related to size. The basic logic is very simple: risky firms will be discounted at a higher rate, therefore current market values will be smaller. This will give the appearance that small firms have higher returns, even though firm size (future cashflows) and risk may be unrelated. Consider a set of firms with log future cash flows c, log price p, and log return r = c − p. Further assume size and risk are independent. Now regress returns on beginning of period size
$$ r = \alpha_1 + \beta_1p + \varepsilon_1. $$
The sign of $\beta_1$ depends on the covariance between r and p
$$ \mathrm{cov}(r, p) = \mathrm{cov}(c - r, r) = \mathrm{cov}(c, r) - \mathrm{var}(r) = -\mathrm{var}(r) < 0. $$
Thus we should expect a negative relation between firm size and returns. Now consider a regression of actual returns on expected (model) returns $\hat{r}$
$$ r = \alpha_2 + \beta_2\hat{r} + \varepsilon_2. $$
Take the pricing errors $\varepsilon_2$ and regress them on current size
$$ \varepsilon_2 = \alpha_3 + \beta_3p + \varepsilon_3. $$
The sign of this regression coefficient depends on the covariance
$$ \mathrm{cov}(p, \varepsilon_2) = \mathrm{cov}(c - r, \varepsilon_2) = \mathrm{cov}(c, r - \alpha_2 - \beta_2\hat{r}) - \mathrm{cov}(\alpha_2 + \beta_2\hat{r} + \varepsilon_2, \varepsilon_2) = -\mathrm{var}(\varepsilon_2) < 0. $$
This shows that size is negatively related to pricing errors. How much of the variation in actual returns is explained by size? Decompose the $R^2$ from the first regression
$$ R^2 = \frac{\mathrm{var}(\beta_1p)}{\mathrm{var}(r)} = \beta_1^2\frac{\mathrm{var}(p)}{\mathrm{var}(r)} = \left[\frac{\mathrm{cov}(r, p)}{\mathrm{var}(p)}\right]^2\frac{\mathrm{var}(p)}{\mathrm{var}(r)} = \frac{\mathrm{var}(r)}{\mathrm{var}(p)} = \frac{\mathrm{var}(r)}{\mathrm{var}(c - r)} = \frac{\mathrm{var}(r)}{\mathrm{var}(r) + \mathrm{var}(c)} $$
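The first regression and its $R^2$ can be checked by simulation. The Python sketch below draws log cash flows and log returns independently, builds log prices from p = c − r, and compares the sample slope and $R^2$ with the formulas above. The normal distributions, standard deviations, sample size, and the function name `berk_size_regression` are assumptions made for this illustration only.

```python
import numpy as np

def berk_size_regression(sd_c, sd_r, n=200_000, seed=0):
    """Simulate r = c - p with c and r independent; return slope and R^2 of r on p."""
    rng = np.random.default_rng(seed)
    c = rng.normal(0.0, sd_c, n)        # log future cash flows
    r = rng.normal(0.0, sd_r, n)        # log returns, independent of cash flows
    p = c - r                           # log price implied by r = c - p
    cov = np.cov(r, p)                  # 2x2 sample covariance matrix of (r, p)
    beta1 = cov[0, 1] / cov[1, 1]       # OLS slope of r on p
    r2 = cov[0, 1] ** 2 / (cov[0, 0] * cov[1, 1])   # R^2 of the simple regression
    return beta1, r2

beta1, r2 = berk_size_regression(sd_c=1.0, sd_r=0.5)
theory_r2 = 0.5 ** 2 / (0.5 ** 2 + 1.0 ** 2)        # var(r) / (var(r) + var(c))
```

With these inputs the slope is negative and the simulated $R^2$ is close to `theory_r2`, even though cash flows and discount rates are unrelated by construction.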
The larger the variation in cashflows the lower is the $R^2$. The basic conclusion of the article is that market value will end up capturing unmeasured/unmodeled risks.

Time Series Patterns
Asset returns contain patterns in autocorrelations summarized in Table 2.4. Using CRSP stock returns from 1962–1994, portfolio autocorrelations range from 1.3% to 43.1%. Autocorrelations increase with shorter time horizons and are higher in equally-weighted portfolios than value-weighted portfolios. Both of these effects are likely due to higher autocorrelation in smaller stocks, which may be due to non-synchronous trading. There is weak evidence of negative autocorrelations in multi-year returns. In most cases the economic significance of the autocorrelations may be small, as is the proportion of the total variance explained. Individual stocks, especially smaller ones, tend to have negative autocorrelation.
Table 2.4: Correlation Patterns
Horizon: Daily, Weekly, Monthly, Annual, Multi-year
Individual: – – –
Portfolio: + + + –
Variance Ratios

The random walk hypothesis implies the variance of asset returns scales with time; a T-period return should have a variance T times as large as a one-period return. The variance ratio VR(T) divides the T-period variance by T times the one-period variance, so it equals one under the random walk. A similar statistic can be derived using variance differences. Finite sample properties can be significantly improved by using overlapping observations and making appropriate degrees of freedom adjustments.

Positive autocorrelations suggest variance ratios greater than one. For the equally-weighted portfolios, this seems to be the case, with VR(2) ≈ 1.2, and increasing with longer horizons. VR(16) ranges from 1.5 to 1.9, depending on the time period (this effect is getting smaller as time goes on). These results disappear in value-weighted portfolios. Looking at size-sorted portfolios, the variance ratios are largest for the small-stock portfolios and are close to one for the stocks in the largest decile. For individual securities the variance ratios are close to one in general, and less than one for the longer horizons. This is because there is some negative autocorrelation in individual security returns due to the bid-ask spread.

The combination of negative autocorrelation in individual securities and positive autocorrelation in portfolios gives rise to positive cross-autocorrelations. This phenomenon can be summarized as a stronger correlation between current small-stock returns and lagged large-stock returns than between current large-stock returns and lagged small-stock returns. More directly, large stocks tend to lead smaller stocks. This can help explain the apparent profitability of contrarian strategies.
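As a rough illustration of the variance ratio with overlapping observations, the sketch below computes VR(q) on simulated data. The function name, the particular degrees-of-freedom adjustment, and the MA(1) example are assumptions made for illustration, not the procedure of any specific study cited here.

```python
import numpy as np

def variance_ratio(returns, q):
    """Overlapping-observation variance ratio VR(q) for one-period returns."""
    r = np.asarray(returns, dtype=float)
    T = r.size
    mu = r.mean()
    var_1 = np.sum((r - mu) ** 2) / (T - 1)        # one-period variance
    cum = np.concatenate(([0.0], np.cumsum(r)))
    r_q = cum[q:] - cum[:-q]                        # T - q + 1 overlapping q-period returns
    m = q * (T - q + 1) * (1.0 - q / T)             # one common degrees-of-freedom adjustment
    var_q = np.sum((r_q - q * mu) ** 2) / m         # per-period variance implied by q-period returns
    return var_q / var_1

rng = np.random.default_rng(1)
e = rng.normal(size=200_001)
iid = e[1:]                                         # random-walk increments: VR should be near 1
ma1 = e[1:] + 0.5 * e[:-1]                          # first-order autocorrelation 0.5 / 1.25 = 0.4

vr_iid = variance_ratio(iid, 2)
vr_ma1 = variance_ratio(ma1, 2)                     # VR(2) = 1 + rho_1, so near 1.4
print(vr_iid, vr_ma1)
```

Since VR(2) equals one plus the first-order autocorrelation, a positively autocorrelated series gives a ratio above one, matching the portfolio evidence described above.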
Long-Horizon Returns

Shiller () and Summers () present models where stock prices have fads or bubbles, causing large slowly decaying swings from fundamental values. Shorter horizon portfolio returns have little autocorrelation, while returns at longer horizons have strong negative autocorrelation. Empirical evidence supports these models, although the tests are based on small sample sizes and lack power. Other empirical results indicate that the variance grows more slowly than the time horizon, also consistent with the model. A general problem is that irrational bubbles in stock prices are not distinguishable from rational time-varying expected returns.

Long-horizon returns are also predictable with other variables such as D/P and E/P. These variables can explain roughly a quarter of the variation in two to four year returns, much more than is possible for shorter horizons.

? propose their contrarian viewpoint, where buying losers and selling winners (measured over 3 to 5 year periods) produces excess returns. Others have argued that the excess returns are due to differences in risk, although a rebuttal paper from DeBondt and Thaler disagrees. It is possible that the contrarian results are due to a size effect or some type of distressed-firm effect.
2.8.2 General Procedures

Multivariate tests can eliminate the errors-in-variables problem and increase the precision of parameter estimates. This type of test still does not say why the model is rejected. Consider a multi-beta model of the form

\[ E_t[R_{i,t+1}] = \lambda_{0,t} + \sum_{j=1}^{K} \beta_{i,j,t} \lambda_{j,t}. \]

To test this using a multivariate regression

\[ R_{i,t+1} = \alpha_i + \sum_{j=1}^{K} \beta_{i,j} R_{j,t+1} + \varepsilon_{i,t+1} \tag{2.22} \]

the intercept restriction is \( \alpha_i = \lambda_0 (1 - \sum_j \beta_{i,j}) \). This is equivalent to mean-variance intersection, meaning that the minimum variance boundaries of all the asset returns and minimum variance portfolios intersect at a single point.
In other words, a combination of mimicking portfolios lies on the mean-variance frontier. The multivariate regression in the restricted form uses TN observations to estimate N + 1 parameters. The unrestricted model has 2N parameters to estimate. Tests with longer time series have more power, while those with more assets have a larger size. The restrictions can be tested with the Wald (W), likelihood ratio (LR), or Lagrange multiplier (LM) statistics. These are all asymptotically χ² but may differ in finite samples.
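A minimal sketch of the three statistics follows. It assumes returns are measured in excess of a known riskless rate, so that the intercept restriction reduces to α_i = 0 for every asset, and it uses the standard expressions for W, LR and LM in terms of restricted and unrestricted residual covariance matrices. The function name and the simulated data are hypothetical.

```python
import numpy as np

def intercept_tests(Z, F):
    """Wald, LR and LM statistics for H0: all intercepts are zero.

    Z : (T, N) excess returns on the test assets
    F : (T, K) excess returns on the factor portfolios
    """
    T, N = Z.shape
    X_u = np.column_stack([np.ones(T), F])          # unrestricted: intercept + factors
    X_r = F                                         # restricted: factors only
    resid_u = Z - X_u @ np.linalg.lstsq(X_u, Z, rcond=None)[0]
    resid_r = Z - X_r @ np.linalg.lstsq(X_r, Z, rcond=None)[0]
    S_u = resid_u.T @ resid_u / T                   # ML residual covariance, unrestricted
    S_r = resid_r.T @ resid_r / T                   # ML residual covariance, restricted
    wald = T * (np.trace(np.linalg.solve(S_u, S_r)) - N)
    lr = T * (np.linalg.slogdet(S_r)[1] - np.linalg.slogdet(S_u)[1])
    lm = T * (N - np.trace(np.linalg.solve(S_r, S_u)))
    return wald, lr, lm

# Simulated data under the null (hypothetical sizes and parameters).
rng = np.random.default_rng(2)
T, N = 240, 5
F = rng.normal(0.005, 0.04, (T, 1))
betas = np.linspace(0.6, 1.4, N)
Z = F * betas + rng.normal(0.0, 0.02, (T, N))

wald, lr, lm = intercept_tests(Z, F)
print(wald, lr, lm)                                 # each is asymptotically chi-square with N d.o.f.
```

In this linear setting the three numbers always satisfy LM ≤ LR ≤ W, so in finite samples the choice of statistic can change the inference even though the asymptotic distribution is the same.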
2.8.3 CAPM Tests

The only testable implications of the CAPM are that the market is mean-variance efficient, and for the SL model that the intercept is zero. Roll (1977) indicates that this is inherently impossible to do since the market is unobservable. “Rejecting” the model may simply mean that the proxy is not mean-variance efficient. Conversely, “failing to reject” may mean that the proxy is mean-variance efficient. In either case, we have not said anything about the mean-variance efficiency of the market. Further, there are always some portfolios which are mean-variance efficient. There is also the issue with conditioning information. The CAPM can hold conditionally but fail unconditionally. Without knowing what conditioning information to use, the models are difficult to test.

Stambaugh (1982) examines the sensitivity to excluded assets in the market proxy, finding inferences are similar regardless of the specific composition of the proxy. Kandel and Stambaugh (1987) and Shanken (1987) estimate the upper bound on the correlation between the proxy and the true market needed to overturn rejection of the model. As long as the correlation is at least 0.70, inferences would not change. Roll and Ross (1994) counter by saying that if the true market portfolio is efficient, cross-sectional relations between expected return and beta are very sensitive to the proxy choice.

As in any statistical test, there is a tradeoff between size and power. Adding assets tends to increase the size of a test in finite samples. A longer time series can considerably increase the power of a test. GMM tests have become popular since they do not rely on normality, homoskedasticity, or uncorrelatedness.

The early evidence was generally supportive of the CAPM, in that the evidence seemed consistent with mean-variance efficiency of the “market” portfolio. Representative studies include Fama and MacBeth (1973), Black,
Jensen, and Scholes (1972), and Blume and Friend (1973). In the mid-1970’s the “anomalies” literature developed [see Fama (1991) for a review]. Common criticisms of these “anomalies” are sample selection and data snooping biases. Kothari, Shanken, and Sloan (1995) claim that sample selection biases drive the results of Fama and French (1992), although Fama and French (1996b) dispute this claim.

Fama-MacBeth (1973)

FM introduce what has become a classic methodology for empirical asset pricing tests. They test the Black and SL CAPMs using monthly portfolio returns and the equally-weighted NYSE as the market. Their tests examine (i) the linearity of the risk-return tradeoff, (ii) if variables other than β matter, (iii) if the risk premium is positive, and (iv) if the return on the zero-beta portfolio is equal to the riskless rate.

The procedure is as follows. First, portfolios are formed using estimated β of individual securities over a four year period. Since measurement error will systematically affect these portfolios, the betas are reestimated over a five year period and averaged across assets to get portfolio β. The β for each portfolio is recalculated each month over the next four years to cover delistings. Returns for each of the 20 portfolios are regressed on the portfolio betas. This is repeated each month, and the estimated coefficients are averaged over time.

The results are generally supportive of the Black model but the estimated riskless rate is higher than the market rate. Additional regressions including β̂² and the asset-specific risk indicate that the risk-return relation is linear and there is no reward for bearing unsystematic risk. Extensions by Litzenberger and Ramaswamy (1979) and Shanken (1992) explicitly adjust standard errors for the EIV bias rather than form portfolios.
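The two-pass logic of the procedure can be sketched on simulated data. This is a simplified illustration, not FM's actual implementation: it uses full-sample betas instead of the rolling formation and estimation windows described above, works with excess returns, and all sizes and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
T, N = 600, 20                                      # months and portfolios (hypothetical)
true_beta = np.linspace(0.5, 1.5, N)
mkt = rng.normal(0.005, 0.04, T)                    # simulated market excess return
R = mkt[:, None] * true_beta + rng.normal(0.0, 0.02, (T, N))   # portfolio excess returns

# First pass: time-series regression of each portfolio on the market gives its beta.
X = np.column_stack([np.ones(T), mkt])
beta_hat = np.linalg.lstsq(X, R, rcond=None)[0][1]  # shape (N,)

# Second pass: one cross-sectional regression of returns on betas per month.
C = np.column_stack([np.ones(N), beta_hat])
gammas = np.linalg.lstsq(C, R.T, rcond=None)[0].T   # shape (T, 2): intercept and slope each month

# Fama-MacBeth estimates: time-series averages, with standard errors taken
# from the time-series variability of the monthly coefficients.
gamma_bar = gammas.mean(axis=0)
gamma_se = gammas.std(axis=0, ddof=1) / np.sqrt(T)
t_stats = gamma_bar / gamma_se
print(gamma_bar, t_stats)
```

Here `gamma_bar[0]` estimates the excess zero-beta return (zero in this simulation) and `gamma_bar[1]` the market risk premium. The standard errors ignore the estimation error in `beta_hat`, which is the issue the EIV adjustments address.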
Shanken (1992) shows that the standard errors in Fama and MacBeth (1973) do not properly reflect measurement error in β, overstating the precision of the risk premium estimates.

Black, Jensen & Scholes (1972)

Fama-French (1992)

The controversial Fama and French (1992) paper has generated a significant debate in the literature. The general goal of the paper is to assess the relative importance of beta, size, B/M, leverage, and E/P in determining the cross-section of expected returns. These variables had been previously documented as important in the “anomalies” literature. Their general findings are that beta is not systematically related to returns, while size and B/M subsume the other factors.

The methodology employed is basically an extension of the Fama and MacBeth (1973) procedure. The new steps involve the combination of accounting and market data. All accounting data for the fiscal year ending t − 1 is combined with returns measured from July of year t to June of t + 1. Stock price data used to construct accounting ratios is from the beginning of year t, while the size measure is from June of year t. This procedure ensures all explanatory variables are known prior to the return.

In order to preserve the firm-specific accounting information, portfolios are not used in the same way as in FM. Instead, portfolios are used to calculate betas, which are then assigned to all firms in that portfolio. The portfolios are formed by first forming size deciles, then forming beta deciles within each size decile. In both sorts, breakpoints are set based on only the NYSE firms. With these 100 portfolios, portfolio betas are each calculated as the sum of the coefficients on current and prior month CRSP value-weighted returns. The beta for a particular stock can change over time as the stock moves into different portfolios. This two-way sorting procedure produces variation in beta that is unrelated to size. Univariate statistics show that average returns are related to size, but unrelated to beta. This evidence is confirmed by the FM regressions.

Gibbons (1982)

Gibbons (1982) introduces a multivariate test of the CAPM and rejects CAPM soundly using LR. He uses the CRSP equally-weighted index as the market, estimates β over a 5 year period, and forms 40 portfolios.
This multivariate methodology avoids the EIV problem, provides more precise risk premium estimates, and has more power than previous tests. The nonlinear restriction on the intercept is linearized with a Taylor-series expansion.

Stambaugh (1982)

Stambaugh (1982) shows inferences are not sensitive to proxy choice, but are sensitive to the asset choice. He argues that W lacks power, LR has the wrong size, and LM is closest to its asymptotic distribution. Using a portfolio of stocks, bonds, and preferred, he fails to reject linearity (Black CAPM), but rejects SL. Using fewer assets he rejects both models.

Shanken (1985)

Shanken (1985) provides the asymptotic results for the multivariate tests in Gibbons (1982). He shows that LM < LR < Q* (= W). These statistics are all transformations of one another. Shanken uses \( Q^A_C \), which includes considerations for sample size and degrees of freedom adjustments. Recalculating Gibbons’ LR statistic, Shanken shows p = 0.75, so the rejection inference is overturned. The cross-section regression test (CSRT) used in this paper does not require specifying H_A. The procedure estimates beta in a first stage, then uses the betas in cross-sectional regressions. The CAPM is rejected using the equally-weighted CRSP index.

MacKinlay (1987)

MacKinlay (1987) discusses power of multivariate SL CAPM tests. He finds that tests against an unspecified alternative have low power. The type of deviation from the model is important in determining power. These tests have reasonable power against cross-sectional random deviations. However, these tests have low power against omitted factors. He rejects in some subperiods but fails to reject overall.
2.8.4 ICAPM/CCAPM Tests

Tests of a multi-beta model are similar to CAPM tests in that they are really tests of the mean-variance efficiency of a particular combination of portfolios. There is mixed evidence about the importance of durable goods. Habit persistence models perform better in goodness-of-fit tests, but still do not explain the first moment of the equity premium puzzle.

Hansen Singleton

Reject model. See QM notes for more details.
Mehra Prescott (1985)

The equity premium puzzle arises because extreme risk aversion parameters are needed to make the low volatility of aggregate consumption growth in the U.S. consistent with the returns on both equity and T-bills. Some of these results may arise partially because of poorly measured consumption data, but efforts to correct for this still lead to rejections of the model. One possible (partial) explanation for the equity premium puzzle is incomplete markets, which may result in the overestimation of risk aversion. One experiment using log utility (CRRA = 1) results in an estimate based on aggregate consumption of CRRA = 3. Weil () presents the same puzzle from the perspective of the riskless asset.
2.8.5 APT Tests

The testable implications of the APT given in (2.17) are
1. λ_i ≠ 0 for any i
2. λ_0 (= r_f) ≥ 0 (debated)
3. linearity

Again, the test really amounts to seeing if a particular combination of portfolios is mean-variance efficient.

To make the intertemporal APT testable, certain restrictions need to be imposed. One alternative is to assume that (i) the observed set of assets has a factor structure, (ii) the noise terms of the observed assets are uncorrelated with the noise terms on the unobserved assets, and (iii) the factors span the state variables. Alternatively, we can assume logarithmic utility in which case the intertemporal APT reduces to the APT. These requirements are very similar to the ICAPM.

As mentioned in Section 2.4.2, the APT has features which make testing difficult. In fact, one view is that APT is not testable [e.g., Shanken (1982), Reisman (1992)], whereas others [Ingersoll (1984), others ??] claim it is. The primary reason for this disagreement is the approximate nature of the model. Are deviations from the exact model due to the approximation or are they genuine deviations from the model itself? The test then becomes a joint test of the model and the additional assumptions needed to impose the exact pricing relation.

The APT and ICAPM are not empirically distinguishable. The “pervasive factors” in the APT world can coincide with the “state variables” in the ICAPM world.
The test of the model requires estimation of both the factor loadings (B) and the factor prices (F). The two primary testing approaches differ in the order these variables are estimated. Cross-sectional tests estimate (B) in the time series, then use these estimates for a number of firms to estimate (F) in the cross-section. The time series tests perform the estimation in the reverse order.

Fama and MacBeth (1973) provide the basic approach for the cross-sectional test [see Section 2.8.3 for details]. Some of these tests estimate the factors statistically while others use economic specifications. Chen, Roll, and Ross (1986) specify five economic variables as factors: industrial production, unexpected inflation, changes in expected inflation, credit quality, and a term premium. They find that the specification is good in the sense that many of these factors are priced and additional factors such as the market return, consumption growth, and changes in oil prices are not priced. Chan, Chen, and Hsieh (1984) perform a study similar to CRR, but are also able to explain the size anomaly. However, Shanken and Weinstein (1990) reply that these two studies are sensitive to the portfolio formation used. Specifically, forming size-based portfolios at the end of the estimation period causes misestimation of the βs to show up systematically in the size portfolios, biasing the subsequent risk premium estimate.

The time series test method was originally proposed by Black, Jensen, and Scholes (1972). Factor prices are estimated in the first pass, and their sensitivity in the second pass. The null hypothesis is that the intercept is zero (or α = (1 − B_i)λ_0 in the absence of a riskless asset).

In summary, the tests of the APT generally reject the model, but the APT seems to perform better than alternatives such as CAPM. The APT has been used in applications which offer indirect evidence of its success as well. In fund performance tests, the model indicates fund managers have negative Jensen’s alphas, a result similar to that from the CAPM models (the magnitudes differ though). In calculating the cost of capital, CAPM and APT yield similar results. In event studies the APT does not seem to offer much gain over a single factor model.
2.8.6 Present Value Relations

The history of volatility and returns tests shows a flip-flop in results. The early variance bounds tests rejected the present value models, whereas the returns tests failed to reject. More recently, volatility bounds tests provide mixed evidence, but the returns tests now reject the model.

Volatility tests

Denote the “perfect foresight” price

\[ p^*_t = \sum_{\tau=1}^{\infty} \beta^{\tau} d_{t+\tau}. \]

Then p_t = E[p*_t], or

\[ p^*_t = E[p^*_t] + \varepsilon_t = p_t + \varepsilon_t \]

\[ \mathrm{var}(p^*_t) = \mathrm{var}(p_t) + \mathrm{var}(\varepsilon_t) \geq \mathrm{var}(p_t) \tag{2.23} \]
This says actual prices should be less volatile than the “model” price from the dividend series. In fact, we find the opposite. Actual prices are more volatile than would be expected from dividends.

There are several problems with the above test. First, the price series is nonstationary so it needs to be modified. Second, the infinite sum is a problem in a finite sample. This can be overcome by including a terminal value in the distant future. Third, the observed dividend series is not a series of independent observations, but rather a single realization. This creates a small sample problem in implementing the test. Fourth, there is no way to capture time-varying expected returns in this framework. Finally, different specifications of the investors’ information sets lead to different critical values, making interpretation difficult. In summary, there are several necessary adjustments to the variance bounds test. Even after making these adjustments, there is no way to hold size constant so there is no way to meaningfully compare the power of this test to alternatives.

Shiller (1981) uses the perfect foresight price decomposition to derive variance bounds. He finds the actual price is five to thirteen times more volatile than the perfect foresight price. His analysis indicates that the price change volatility is highest when information about dividends is revealed smoothly. Large, occasional information releases result in prices with lower variance but higher kurtosis.
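The variance bound in (2.23) can be illustrated with a simulation in which the present value model holds by construction. The sketch assumes stationary AR(1) dividends, a constant discount factor, and a terminal value equal to the final model price to truncate the infinite sum; all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
T = 20_000
beta, phi, mu = 0.95, 0.9, 1.0          # hypothetical discount factor and dividend parameters

# Dividends follow d_{t+1} - mu = phi (d_t - mu) + e_{t+1}.
d = np.empty(T)
d[0] = mu
shocks = rng.normal(0.0, 0.1, T)
for t in range(1, T):
    d[t] = mu + phi * (d[t - 1] - mu) + shocks[t]

# Rational price: p_t = sum over tau of beta^tau E_t[d_{t+tau}], in closed form for an AR(1).
p = mu * beta / (1 - beta) + (d - mu) * beta * phi / (1 - beta * phi)

# Perfect-foresight price by backward recursion p*_t = beta (p*_{t+1} + d_{t+1}),
# with the terminal value p*_T = p_T standing in for the distant future.
p_star = np.empty(T)
p_star[-1] = p[-1]
for t in range(T - 2, -1, -1):
    p_star[t] = beta * (p_star[t + 1] + d[t + 1])

eps = p_star - p                         # forecast error, uncorrelated with p_t in theory
print(np.var(p_star), np.var(p), np.corrcoef(p, eps)[0, 1])
```

When the model is true the perfect-foresight price is the more volatile series, which is the opposite of what is found in the data.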
Returns tests

Tests of long horizon returns have found that there is significant negative autocorrelation over the three to five year horizon, indicating a tendency for mean reversion.

Orthogonality tests

A model-free version is not subject to the nuisance parameter problem which plagues the variance bounds test. Both the model-free and the model-based orthogonality tests are better-behaved econometrically than the returns tests.
Chapter 3
Fixed Income

3.1 Introduction

The pricing of bonds differs from pricing other assets such as equity primarily because bonds are nonlinear. A bond has:
1. fixed, known maturity
2. fixed, known terminal (face) value
3. fixed, known periodic cash flows
4. thinner trading (at least for “older” issues)

Term structure models can be viewed as time series models of the stochastic discount factor.

Duration, Convexity

3.2 Term Structure Basics

3.3 Inflation and Returns

3.4 Forward Rates
Forward rates had been viewed simply as forecasts of expected future spot rates (PEH). Fama (??) shows that the forward rates also contain expectations of the premium above one month T-bills.
• Holding period return is the change in log price on a particular bond from one period to the next.
• The forward rate is the difference in the log prices of bonds of different maturities at the same point in time.
• Premium is the holding period return less the one month spot rate.
Fama (1984)

Fama uses a regression approach to separate the information about expected future spot rates from information about the expected premium.
1. premium = f(forward − spot)
2. ∆ spot = f(forward − spot)

Results are that forward rates can predict premiums which vary through time and the expected future spot rate up to five months out. Froot has a response to Fama’s finding, suggesting that Fama ignores systematic expectations errors.

Fama–Bliss (1987)

Find that forward rate forecasts of near-term changes in interest rates are poor, but forecast power increases at longer time horizons. Interpret this as evidence of a slow mean-reverting process. Also find evidence of time-varying expected premiums, and that the ordering of risks and rewards changes with the business cycle.

Stambaugh (1988)

An affine yield model implies a latent variable structure for bond returns. Fewer state variables than forecasting variables puts testable restrictions on forecasting equations for bond returns. Reject CIR with non-matched maturities (avoids measurement error). Addresses source of errors, their consequences, and how the choice of instruments affects the outcome of the tests.
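Returning to the two Fama (1984) regressions: with the definitions of the holding period return, forward rate and premium given above, the premium and the change in the spot rate add up to the forward-spot spread, so the two slope coefficients must sum to one. The sketch below verifies this for one- and two-period bonds. The data-generating process for the log prices is an ad hoc assumption chosen only to produce some data; the identity holds for any price series.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 5_000

# Hypothetical one-factor data: x_t is an AR(1) short rate driving log bond prices.
x = np.empty(T)
x[0] = 0.05
for t in range(1, T):
    x[t] = 0.05 + 0.95 * (x[t - 1] - 0.05) + rng.normal(0.0, 0.002)
p1 = -x                                              # log price of the one-period bond
p2 = -(0.001 + 1.9 * x) + rng.normal(0.0, 0.001, T)  # log price of the two-period bond (ad hoc)

spot = -p1[:-1]                                      # one-period spot rate at t
forward = p1[:-1] - p2[:-1]                          # difference in log prices at t
hpr = p1[1:] - p2[:-1]                               # change in log price of the two-period bond
premium = hpr - spot                                 # holding period return less the spot rate
d_spot = -p1[1:] - spot                              # change in the spot rate

def ols(y, z):
    """Intercept and slope from regressing y on a constant and z."""
    Z = np.column_stack([np.ones_like(z), z])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

a1, b1 = ols(premium, forward - spot)                # regression 1
a2, b2 = ols(d_spot, forward - spot)                 # regression 2
print(b1, b2, b1 + b2)
```

Because the slopes are complementary, evidence that the spread forecasts the premium is at the same time evidence about how much of it forecasts the change in the spot rate.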
3.5 Bond Pricing

Like any asset, bonds can be priced using the pricing kernel approach presented in Section 2.5. Begin with the fundamental pricing equation

\[ 1 = E_t[M_{t+1} R_{n,t+1}]. \]
The uppercase M is used to distinguish it from logs and the n subscript indicates the time to maturity. The return can obviously be expressed as the relative price change

\[ R_{n,t+1} = P_{n-1,t+1} / P_{n,t}. \]

Substituting this into the pricing equation gives

\[ P_{n,t} = E_t[M_{t+1} P_{n-1,t+1}]. \]

Recursive substitution and the fact that the bond is worth a dollar at maturity gives another representation

\[ P_{n,t} = E_t[M_{t+1} \cdots M_{t+n}]. \]

In this light a bond pricing model is really a time series model of the stochastic discount factor.

Fixed income models are broadly categorized as either stochastic interest rate models or stochastic term structure models. Stochastic interest rate models begin by specifying a process dr for the short rate. The problem with this approach is that the model price of the bond may not equal the market price. The short rate process also implies prices for bonds of other maturities and these may be mispriced as well. The stochastic term structure models use the observed market prices and estimates of the volatility structure to infer the stochastic process of the short rate. This information is then used to get a distribution for the bond price.
3.6 Affine Models

Affine yield models represent a class of relatively simple models in which all relevant variables are conditionally log-normal and log yields are linear in state variables. Affine forward rates imply affine yields. Taking logs of the pricing relation

\[ p_{n,t} = E_t[m_{t+1} + p_{n-1,t+1}] + \tfrac{1}{2} \mathrm{var}(m_{t+1} + p_{n-1,t+1}). \]

A model with k state variables implies that the term structure can be summarized by the levels of k bond yields at each point in time and the constant coefficients relating the bond yields. In this sense affine yield models are linear; they are non-linear in the evolutionary process of the k basis yields and the relation between the cross-sectional coefficients and the underlying parameters of the model.
Table 3.1: Single Factor Stochastic Interest Rate Models, dr = (α + βr)dt + σ r^γ dZ (blank entries are unrestricted)

Model          α    β    γ     Specification
Merton (ABM)        0    0     α dt + σ dZ
Vasicek                  0     (α + βr) dt + σ dZ
CIR SR                   1/2   (α + βr) dt + σ √r dZ
Courtadon                1     (α + βr) dt + σ r dZ
Dothan         0    0    1     σ r dZ
GBM            0         1     βr dt + σ r dZ
CIR VR         0    0    3/2   σ r^{3/2} dZ
CEV            0               βr dt + σ r^γ dZ
Duffie-Kan               1/2   (α1 + β1 r) dt + (α2 + β2 r)^γ dZ
Assumptions
• distribution of the SDF is conditionally lognormal;
• bond prices are jointly lognormal with the SDF;
• (additional strong assumptions): homoskedastic m_{t+1} (Vasicek)

Properties
• Log prices (and yields) are affine in state variables.
• Analytic solution of pricing equations (outside affine yield generally requires numerical solutions e.g., Black, Derman, and Toy).
• Trivial rejection of model without addition of an error term.
• Limits the way in which interest rate volatility can change with the level of interest rates.
• Implies risk premia on long bonds always have the same sign (single-factor).
• Applies to real bonds only ?
• The model can be renormalized so that the yields themselves are the state variables (e.g., a two-factor model would use two yields).
3.6.1 Vasicek

\[ dr = \kappa(\theta - r)dt + \sigma dB \]

\[ y_{1t} = x_t - \beta^2 \sigma^2 / 2 \quad \text{and} \quad -p_{nt} = A_n + B_n x_t \]

To get this model begin by writing the sdf as a forecast and an innovation

\[ -m_{t+1} = x_t + \varepsilon_{t+1}. \]

The sign is a convention. Assume that x_{t+1} follows an AR(1) process with innovation ξ_{t+1} (variance σ²) and, for simplicity, that ε_{t+1} is proportional to ξ_{t+1} (any component of ε_{t+1} uncorrelated with ξ_{t+1} is ignored)

\[ x_{t+1} - \mu = \phi(x_t - \mu) + \xi_{t+1} \quad \text{and} \quad \varepsilon_{t+1} = \beta \xi_{t+1}. \]

Now consider the log price of a one period bond

\[ p_{1,t} = E_t[m_{t+1}] + \tfrac{1}{2} \mathrm{var}(m_{t+1}) = -x_t + \tfrac{1}{2} \beta^2 \sigma^2 = -y_{1,t}. \]

• Allows interest rates to be negative (OK for real, not nominal).
• Can handle rising, inverted, and humped yield curves, but not inverted humped curves.
• Price of interest rate risk is a constant that does not depend on the level of the short rate.
• Interest rate changes have constant variance.
• Limiting forward rate can not be both finite and time-varying.
• Log forward rate curve tends to slope downwards unless β is sufficiently small.
• Random walk is a special case.
• B measures the sensitivity of the n-period bond return to the one-period interest rate (and the state variable). This sensitivity increases in maturity, and is always less than the maturity.
• Average short rate is µ − β²σ²/2.
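The coefficients A_n and B_n can be computed recursively. Substituting the guess −p_{n,t} = A_n + B_n x_t into the log pricing relation p_{n,t} = E_t[m_{t+1} + p_{n−1,t+1}] + ½ var(m_{t+1} + p_{n−1,t+1}), under the AR(1) assumptions above with var(ξ) = σ², gives B_n = 1 + φB_{n−1} and A_n = A_{n−1} + (1 − φ)µB_{n−1} − (β + B_{n−1})²σ²/2, starting from A_0 = B_0 = 0. The sketch below implements this recursion; the function name and the parameter values are hypothetical.

```python
import numpy as np

def vasicek_coefficients(n_max, phi, mu, beta, sigma):
    """Return A[0..n_max] and B[0..n_max] such that -p_{n,t} = A_n + B_n x_t."""
    A = np.zeros(n_max + 1)
    B = np.zeros(n_max + 1)
    for n in range(1, n_max + 1):
        B[n] = 1.0 + phi * B[n - 1]
        A[n] = (A[n - 1] + (1.0 - phi) * mu * B[n - 1]
                - 0.5 * (beta + B[n - 1]) ** 2 * sigma**2)
    return A, B

phi, mu, beta, sigma = 0.95, 0.05, -1.0, 0.01       # hypothetical parameter values
A, B = vasicek_coefficients(10, phi, mu, beta, sigma)

x_t = 0.06                                          # current value of the state variable
maturities = np.arange(1, 11)
yields = (A[1:] + B[1:] * x_t) / maturities         # y_{n,t} = -p_{n,t} / n
print(B[1:])
print(yields)
```

For n = 1 the recursion reproduces y_{1t} = x_t − β²σ²/2, and B_n = (1 − φⁿ)/(1 − φ) rises with maturity while staying below n for n > 1 when φ < 1.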
3.6.2 The CIR Model

\[ dr = \kappa(\theta - r)dt + \sigma \sqrt{r}\, dB \]

The basic CIR model is a general equilibrium, continuous time model of the real returns on the asset in an economy [see section 2.3.5]. The general model is specialized to the term structure in ?. The asset is used to smooth consumption, so its value depends on its hedging effectiveness, or its covariance with consumption. The model is derived in an option pricing framework by constructing a riskless synthetic portfolio, which must earn the riskless rate in equilibrium. The hedge portfolio is constructed of bonds of differing maturities; it is assumed that the market price of risk is the same for bonds of all maturities. A recursive approach must be used to solve the model. Although the model claims to endogenously derive the interest rate process, it is a direct consequence of the specification of the state variable.

Assumptions
• identical individuals with time-additive log utility (Dunn and Singleton relax this assumption but do not have much success)
• x_{t+i} and m_{t+i} are normal conditional on x_t for i = 1, but non-normal for i > 1.
• y_{1t} = −p_{1t} = x_t(1 − β²σ²/2); y_{1t} is proportional to the state variable and its conditional variance is proportional to its level.
• restricts interest rates to be positive

Predictions
• Variance proportional to the state variable.
• All bond returns are perfectly correlated (general prediction of all single-factor models).
• Prices are a deterministic function of the parameters, the short rate, and maturity; an error term must be specified to keep the model testable.
• The long rate converges to a constant.
• Stable parameters (λ, κ, θ, σ).
• Forward rate f_{nt} = −B_n² x_t σ²/2
• time variation in term premia ?
3.6.3 Duffie-Kan Class

The Duffie-Kan model is the most general affine model possible. It nests all the common models as special cases.

\[ dr = \kappa(\theta - r)dt + \sqrt{\alpha + \beta r}\, dZ \]

3.6.4 Other Single Factor Models

HJM, Ho-Lee, BDT

3.6.5 Alternatives

• Non-linear models (γ = 3/2)
• Non-parametric models
• Markov switching models
• GARCH
• Higher-order ARMA processes
• Several state variables
3.7 Multi-Factor Models

Longstaff and Schwartz (2-factor)

\[ -m_{t+1} = x_{1t} + x_{2t} + x_{1t}^{1/2} \varepsilon_{t+1} \]

\[ p_{1t} = -x_{1t} - x_{2t} + x_{1t} \beta^2 \sigma_1^2 / 2 \]

• second factor (instantaneous variance of changes in short rate) avoids implication that all bond returns are perfectly correlated
• variance of innovation to log SDF is proportional to the level of x_{1t} and is conditionally correlated with x_{1t} but not with x_{2t}.
• One-period yield is no longer proportional to x_{1t} and the short rate alone is no longer sufficient to describe the state of the economy.
• The model is a generalization of the square-root model
• it can also generate inverted humped yield curves.
• Whenever the SDF can be expressed as the sum of two independent processes, the resulting term structure is the sum of the term structures that would exist under each of these processes.
3.8 Empirical Tests

Table 3.2: Summary of Empirical Results

Paper        Data(a)  Methods(b)  Notes                                              Results
BD (1986)    P,N      C,ML        iid errors                                         r̂ > r, σ not constant
BS (1994)    P,R      C,ML                                                           Unstable est., don’t support mean reversion, σ > 0 binds
CKLS (1992)  Y,N      TS,GMM      assume normality                                   reject γ < 1, unconstr. γ = 1.5, mean reversion not important
GR (1993)    Y,R      TS,GMM      forecast R from N, use non-central χ²              fail to reject CIR, plausible estimates, fit short bonds better
PS (1994)    P,N      TS,ML       second factor for inflation, non-central χ²        unstable/unrealistic estimates, reject original and two factor CIR
LS (1992)    Y,N      C,GMM       second factor for volatility estimated with GARCH  reject single factor model, 2-factor holds for short and int bonds

(a) Price or Yield; Nominal or Real.
(b) Cross-section or Time series, Econometric Method.

3.8.1 Brown & Dybvig (1986)

• Nominal, prices, cross-sectional, ML
• Assume pricing errors are iid - a strong assumption given the differences in trading frequency across maturities; an alternative is to assume variance increases with maturity and is correlated across maturities.
• Estimated r systematically overstates implied short rates (recall Fama MacBeth; Merton’s model of heterogeneous information sets).
• Find estimated variance is erratic, although similar in magnitude to CIR weekly time series estimates.
• Find annual average of implied standard deviation \( \overline{\hat{\sigma} \sqrt{\hat{r}}} \) appears to be an unbiased predictor of time series estimate of the standard deviation of changes in the short rate.
• Bills appear to be better described by the model than bonds.
• Discount issues’ prices are underestimated, premiums are overestimated.
• Evidence that the errors are not iid.
3.8.2 Brown & Schaefer (1994)

• Real, prices, cross-sectional, ML
• CIR model is generally able to replicate observed yield curve shapes
• Pricing errors are generally within the bid–ask spread
• Parameter estimates are unstable, especially κ + λ
• Positivity constraint on σ² binds in many cases
• Cross-sectional estimates of variance are not unbiased estimates of the time series estimates.
• Evidence on mean reversion is generally not supportive
3.8.3 Chan, Karolyi, Longstaff & Sanders (1992)

CKLS present a generalized model that nests eight popular interest rate processes.

\[ dr = (\alpha + \beta r)dt + \sigma r^{\gamma} dZ \]

• Nominal, yields, time series, GMM
• The γ term seems to be the most important; models with γ < 1 are all rejected, and those with γ = 1.5 fare the best. The unrestricted estimate of γ is 1.5, and is significantly different from unity.
• The mean reversion process, which adds considerable complexity to the model, does not appear to be of major importance.
• Results are trouble for single-factor affine yield models: without mean reversion, the term structure may increase initially, but will then be downward sloping. Second, with γ > 0.5, the models become intractable and must be solved numerically.
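A GMM estimation of this type works with a discrete-time approximation, r_{t+1} − r_t = α + βr_t + ε_{t+1} with E[ε_{t+1}] = 0 and E[ε²_{t+1}] = σ² r_t^{2γ}, using a constant and r_t as instruments. The sketch below only evaluates the resulting four sample moments on simulated data rather than running a full GMM estimation; the function name and all parameter values are hypothetical.

```python
import numpy as np

def ckls_moments(r, alpha, beta, sigma, gamma):
    """Sample averages of the four moment conditions of the discretized model."""
    r_t, r_next = r[:-1], r[1:]
    eps = r_next - r_t - alpha - beta * r_t            # drift residual
    v = eps**2 - sigma**2 * r_t ** (2.0 * gamma)       # variance residual
    return np.array([eps.mean(), (eps * r_t).mean(), v.mean(), (v * r_t).mean()])

# Simulate the discretized process with hypothetical square-root (gamma = 1/2) parameters.
rng = np.random.default_rng(6)
alpha, beta, sigma, gamma = 0.006, -0.1, 0.02, 0.5
T = 100_000
r = np.empty(T)
r[0] = -alpha / beta                                   # start at the long-run mean
z = rng.standard_normal(T)
for t in range(T - 1):
    level = max(r[t], 1e-8)                            # guard against a negative rate
    r[t + 1] = r[t] + alpha + beta * r[t] + sigma * level**gamma * z[t + 1]

g_true = ckls_moments(r, alpha, beta, sigma, gamma)    # close to zero at the true parameters
g_wrong = ckls_moments(r, alpha, beta, sigma, 0.0)     # gamma = 0 imposes constant volatility
print(g_true)
print(g_wrong)
```

With four moments and four parameters the unrestricted model is exactly identified. Restricting γ, as each nested model does, leaves overidentifying restrictions that can be tested.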
CHAPTER 3. FIXED INCOME
3.8.4
Gibbons & Ramaswamy (1993)
• Forecast real returns on nominal bonds in a time series setting (assume inflation is independent of the real SDF?)
• GMM in a time series
• Fail to reject CIR, obtain plausible parameter estimates
• Reject with off-the-run bonds (measurement error and a small sample).
• Model fits short end of term structure better than longer maturities.
• Find some evidence of autocorrelation in returns
3.8.5
Pearson & Sun (1994)
• Nominal, prices, time series, ML
• Generalize square-root model to allow the variance of the state variable to be linear in the level of the state variable.
• Also include a second factor, expected inflation.
• Reject original and two-factor CIR model.
• Unrealistic parameter estimates:
• Unstable parameter estimates (across datasets).
• Within sample prediction has no power and is little better than a naive prediction of current values.
3.8.6
Longstaff & Schwartz (1992)
• Second factor for volatility estimated using GARCH
• Test cross-sectional restrictions with GMM
• Find model holds for both short- and intermediate-term maturities
• Reject single-factor model
Chapter 4
Derivatives
4.1
Introduction
Virtually all derivatives pricing is based on some sort of arbitrage argument. This chapter outlines derivative pricing in terms of both discrete- and continuous-time models. Several derivations of each model are given to show the links between them. More advanced topics are covered rather superficially.
4.2
Binomial Models
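Binomial option pricing is a special case of Arrow-Debreu pricing presented in Section 2.4.1. Standardize the asset to have a current price of $1, a value of u in the “up state,” and a value of d in the “down state.” Recall \(X\phi = p\) and \(X'\alpha = b\), so
\[ \begin{pmatrix} u & d \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \end{pmatrix} = \begin{pmatrix} 1 \\ p_f \end{pmatrix}. \]
Solving these equations with \(p_f = 1/R_f\),
\[ \phi_1 = \frac{R_f - d}{R_f (u - d)} \quad\text{and}\quad \phi_2 = \frac{u - R_f}{R_f (u - d)}. \]
Scaling the state prices by \(R_f\) gives the risk-neutral probabilities \(\pi_i = R_f \phi_i\):
\[ \pi_1 = \frac{R_f - d}{u - d} \quad\text{and}\quad \pi_2 = \frac{u - R_f}{u - d}. \]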
The binomial model is based on a replication argument. Consider positions in a stock and bond such that the portfolio replicates the payoffs on an option in the next period. That is, we want to find holdings in the stock and bond, \(\Delta\) and \(B\), so that the value of the position is \(C^u\) in the up-state and \(C^d\) in the down-state:
\[ S u \Delta + R_f B = C^u \quad\text{and}\quad S d \Delta + R_f B = C^d. \]
Solving these equations gives
\[ \Delta = \frac{C^u - C^d}{S(u - d)} \quad\text{and}\quad B = \frac{u C^d - d C^u}{R_f (u - d)} = \frac{C^u - S u \Delta}{R_f}. \]
The stock holding \(\Delta\) has the interpretation of the partial derivative of the call price with respect to the stock price. The current price of the option is
\[ C = \Delta S + B = \frac{\pi C^u + (1 - \pi) C^d}{R_f}. \]
To implement this approach we need to calculate u and d. The formal specification is
\[ u = \exp\!\left( \frac{\mu_a \tau}{n} + \frac{\sigma_a \sqrt{\tau}}{\sqrt{n}} \sqrt{\frac{1-\theta}{\theta}} \right) \quad\text{and}\quad d = \exp\!\left( \frac{\mu_a \tau}{n} - \frac{\sigma_a \sqrt{\tau}}{\sqrt{n}} \sqrt{\frac{\theta}{1-\theta}} \right) \]
but a “shortcut” specification is
\[ u = \exp\!\left( \frac{\sigma_a \sqrt{\tau}}{\sqrt{n}} \right) \quad\text{and}\quad d = \exp\!\left( -\frac{\sigma_a \sqrt{\tau}}{\sqrt{n}} \right). \]
The subscript a indicates annual figures, and continuous compounding should be used. The life of the option is \(\tau\) and there are n periods in the binomial tree. The corresponding riskless rate is \(R_f = \exp(r_a \tau / n)\). Solving for the price of the option uses a recursive algorithm. At the expiration of the option the value is given by \(C = \max(S_T - K, 0)\). Using these values, the option price at T − 1 can be calculated. Stepping backwards through the tree gives the initial option price. To get the price of a European put option, put-call parity can be used. This is an arbitrage argument that requires \(S + P = C + K e^{-r\tau}\).
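The backward recursion described above can be sketched as follows. This is a minimal illustration that assumes the “shortcut” specification for u and d; the function names and the numerical inputs in the test are hypothetical.

```python
import math

def binomial_european_call(S, K, r, sigma, tau, n):
    """European call from an n-step tree with u = exp(sigma*sqrt(tau/n)), d = 1/u."""
    dt = tau / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    Rf = math.exp(r * dt)                  # gross riskless return per step
    pi = (Rf - d) / (u - d)                # risk-neutral up probability
    # Terminal payoffs; j counts the number of up moves.
    values = [max(S * u**j * d**(n - j) - K, 0.0) for j in range(n + 1)]
    # Step backwards: each node is the discounted risk-neutral expectation.
    for step in range(n, 0, -1):
        values = [(pi * values[j + 1] + (1 - pi) * values[j]) / Rf
                  for j in range(step)]
    return values[0]

def european_put_from_parity(call, S, K, r, tau):
    """Put-call parity: S + P = C + K*exp(-r*tau)."""
    return call + K * math.exp(-r * tau) - S
```

For example, `binomial_european_call(100, 100, 0.05, 0.2, 1.0, 200)` should be close to the Black-Scholes value for the same inputs.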
Table 4.1: Early Exercise of American Options

         Call             Put
d = 0    Never            In the money
d > 0    Before ex-date   After ex-date
Volatility does not enter the equation directly since it affects the put and call in the same way. If the option is American it is necessary to check for early exercise at each node in the tree. To do so, simply use \(C = \max(C_h, C_x)\), where h indicates the hold value as calculated above and x the early exercise value. Early exercise is never optimal for a call on a stock that does not pay dividends. For the put to be exercised early it must be sufficiently in the money. If the stock does pay dividends, calls may be exercised just before the ex-date and puts just after the ex-date. The number of steps in the tree affects the answer for the option price. The model value converges to the true value as the number of nodes gets large, but at a computational expense. The model price generally changes very little after about a hundred steps. There is an “odd-even” effect where the calculated value oscillates between over- and under-valued as the number of nodes is incremented. To remove this error, you can use a weighted average of prices calculated at n − 1, n, n + 1 nodes.
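The early-exercise check at each node can be sketched for an American put on a non-dividend-paying stock. This is an illustrative implementation under the same “shortcut” u and d as before; the function name and inputs are hypothetical.

```python
import math

def binomial_american_put(S, K, r, sigma, tau, n):
    """American put: at every node take max(hold value, early exercise value)."""
    dt = tau / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    Rf = math.exp(r * dt)
    pi = (Rf - d) / (u - d)
    # Values at expiration; j counts up moves.
    values = [max(K - S * u**j * d**(n - j), 0.0) for j in range(n + 1)]
    for step in range(n - 1, -1, -1):
        new_values = []
        for j in range(step + 1):
            hold = (pi * values[j + 1] + (1 - pi) * values[j]) / Rf
            exercise = K - S * u**j * d**(step - j)
            new_values.append(max(hold, exercise))
        values = new_values
    return values[0]
```

A deep in-the-money put returns its intrinsic value, consistent with the statement that the put must be sufficiently in the money to be exercised early.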
4.2.1
Alternative Derivations
CAPM-based derivation
The standard CAPM result is \(E[r_i] = r_f + \beta_i (E[r_m] - r_f)\) and \(\beta_i = \sigma_{im}/\sigma_m^2 = \rho_{im}\sigma_i/\sigma_m\). Let \(\lambda_i = \rho_{im}[E[r_m] - r_f]/\sigma_m\), the correlation-adjusted market risk premium. Rewriting the CAPM relation,
\[ E[r_i] = r_f + \lambda_i \sigma_i \]
or \(E[R_i] = R_f + \lambda_i \sigma_i\). Now assume asset i is an option written on a stock whose returns follow a binomial process. \(P^u\) and \(P^d\) are the end-of-period prices in the up- and down-states, with \(\theta\) the true probability of the up-state. The current price is given by \(P\). Then
\[ E[R_i] = \frac{\theta P^u + (1-\theta) P^d}{P} \quad\text{and}\quad \sigma_i = \frac{P^u - P^d}{P}\sqrt{\theta(1-\theta)}. \]
Substituting these expressions into the modified CAPM expression and rearranging yields
\[ P = \frac{P^u \pi + P^d (1-\pi)}{R_f} \]
where the risk-neutral probability \(\pi = \theta - \lambda_i\sqrt{\theta(1-\theta)}\) is a function of the true probabilities and the correlation-adjusted market price of risk. To avoid arbitrage, all assets must be priced with the same risk-neutral probabilities. Every dollar investment in the stock should be priced according to
\[ 1 = \frac{u\pi + d(1-\pi)}{R_f}. \]
Rearranging and solving for \(\pi\) gives
\[ \pi = \frac{R_f - d}{u - d}. \]
Note that when \(\lambda_i = 0\), \(\pi = \theta\). This happens when investors are actually risk-neutral or if the security is uncorrelated with the market. With \(\lambda_i > 0\) the risk-neutral probabilities overstate the true probabilities in unfavorable states and understate the truth in favorable states.
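A small numerical check may help connect the two expressions for the risk-neutral probability. The values of u, d, \(R_f\), and θ below are made up for illustration, and the function names are hypothetical.

```python
import math

def risk_neutral_prob(theta, lam):
    """pi = theta - lambda * sqrt(theta * (1 - theta))."""
    return theta - lam * math.sqrt(theta * (1 - theta))

def implied_lambda(theta, u, d, Rf):
    """Price of risk that makes the CAPM-based pi equal (Rf - d)/(u - d)."""
    pi = (Rf - d) / (u - d)
    return (theta - pi) / math.sqrt(theta * (1 - theta))

# Illustrative (assumed) inputs.
u, d, Rf, theta = 1.2, 0.9, 1.05, 0.6
lam = implied_lambda(theta, u, d, Rf)
pi = risk_neutral_prob(theta, lam)

stock_price = (u * pi + d * (1 - pi)) / Rf          # one dollar of stock prices at 1
expected_return = theta * u + (1 - theta) * d        # E[R] under true probabilities
sigma = (u - d) * math.sqrt(theta * (1 - theta))     # std. dev. of the stock return
```

Here λ is positive, so π is below θ: the favorable up-state receives less weight under the risk-neutral measure than under the true one.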
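Relation to Black Scholes
Superscripts u and d index the up- and down-states, while subscripts denote partial derivatives. Begin with the single period binomial option pricing equation
\[ C = \frac{V^u \pi + V^d (1-\pi)}{R_f} \]
where
\[ \pi = \frac{R_f - d}{u - d} = \frac{e^{r\tau} - e^{-\sigma\sqrt{\tau}}}{e^{\sigma\sqrt{\tau}} - e^{-\sigma\sqrt{\tau}}}. \]
Assume a 50% probability of the up-state to get \(u = e^{\sigma\sqrt{\tau}}\) and \(d = e^{-\sigma\sqrt{\tau}}\). Re-expressing \(V^u\) and \(V^d\),
\[ V^u = C(e^{\sigma\sqrt{\tau}} S, t+\tau) \quad\text{and}\quad V^d = C(e^{-\sigma\sqrt{\tau}} S, t+\tau). \]
Substitute into the binomial equation
\[ C(S,t) = \frac{(e^{r\tau} - e^{-\sigma\sqrt{\tau}})\, C(e^{\sigma\sqrt{\tau}} S, t+\tau) + (e^{\sigma\sqrt{\tau}} - e^{r\tau})\, C(e^{-\sigma\sqrt{\tau}} S, t+\tau)}{e^{r\tau}\,[e^{\sigma\sqrt{\tau}} - e^{-\sigma\sqrt{\tau}}]}. \]
Next, perform several Taylor series expansions.
\[ \Delta S^u = (e^{\sigma\sqrt{\tau}} - 1) S, \qquad \Delta S^d = (e^{-\sigma\sqrt{\tau}} - 1) S, \qquad \Delta t = [(t+\tau) - t] = \tau \]
\[ e^{\sigma\sqrt{\tau}} = 1 + \sigma\sqrt{\tau} + \tfrac{1}{2}\sigma^2\tau, \qquad e^{-\sigma\sqrt{\tau}} = 1 - \sigma\sqrt{\tau} + \tfrac{1}{2}\sigma^2\tau, \qquad e^{r\tau} = 1 + r\tau \]
\[ C(e^{\sigma\sqrt{\tau}} S, t+\tau) = C + (e^{\sigma\sqrt{\tau}} - 1) S C_S + \tfrac{1}{2}(e^{\sigma\sqrt{\tau}} - 1)^2 S^2 C_{SS} + \tau C_t \]
and similarly for the down state. Substituting all this into the expanded binomial formula, cancelling like terms, and dropping terms involving higher orders of \(\tau\) gives the Black-Scholes PDE
\[ C_t = rC - rSC_S - \tfrac{1}{2}\sigma^2 S^2 C_{SS}. \]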
4.2.2
Trinomial Models
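Multinomial models are based on matching risk-neutral moments. For example, the trinomial model requires three probabilities, \(p_u\), \(p_m\), and \(p_d\). If the stock price process over a step of length k is
\[ \frac{dS}{S} = (r - \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dW = \alpha k + \sigma\,dW \]
then \(E[dS/S] = \alpha k\) and the second moment is \(E[(dS/S)^2] = \alpha^2 k^2 + \sigma^2 k\) (the variance \(\sigma^2 k\) plus the squared mean). Three equations are used to solve for the three unknown probabilities, where h is the size of an up or down move:
\[ p_u h + p_m \cdot 0 + p_d(-h) = \alpha k \]
\[ p_u h^2 + p_m \cdot 0^2 + p_d h^2 = \alpha^2 k^2 + \sigma^2 k \]
\[ p_u + p_m + p_d = 1. \]
The resulting answers are
\[ p_u = \frac{1}{2}\left( \sigma^2\frac{k}{h^2} + \alpha^2\frac{k^2}{h^2} + \alpha\frac{k}{h} \right) \]
\[ p_m = 1 - \sigma^2\frac{k}{h^2} - \alpha^2\frac{k^2}{h^2} \]
\[ p_d = \frac{1}{2}\left( \sigma^2\frac{k}{h^2} + \alpha^2\frac{k^2}{h^2} - \alpha\frac{k}{h} \right) \]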
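The moment-matching solution can be checked numerically. The sketch below uses hypothetical inputs, and the step size \(h = \sigma\sqrt{3k}\) is a common choice assumed here for illustration rather than one taken from these notes.

```python
import math

def trinomial_probs(r, sigma, k, h):
    """Solve the three moment-matching equations for (p_u, p_m, p_d)."""
    alpha = r - 0.5 * sigma**2
    second_moment = sigma**2 * k + alpha**2 * k**2
    p_u = 0.5 * (second_moment / h**2 + alpha * k / h)
    p_d = 0.5 * (second_moment / h**2 - alpha * k / h)
    p_m = 1.0 - second_moment / h**2
    return p_u, p_m, p_d

# Illustrative (assumed) inputs: weekly-ish step with h = sigma*sqrt(3k).
r, sigma, k = 0.05, 0.2, 0.01
h = sigma * math.sqrt(3 * k)
p_u, p_m, p_d = trinomial_probs(r, sigma, k, h)
```

If h is chosen too small relative to \(\sigma\sqrt{k}\), the middle probability becomes negative, so the choice of h is a constraint in practice.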
4.3
Black Scholes Model
The famous Black and Scholes (1973) option pricing model and its extensions by Merton (1973) have revolutionized derivative pricing.
4.3.1
Black Scholes Derivations
Derivation I: Replication
Assume the stock price follows GBM, \(dS = \mu S\,dt + \sigma S\,dW\), and there is a riskless asset \(B = e^{rt}\). The option price depends on the stock price and time, \(C(S,t)\). Using Ito's Lemma,
\[ dC = C_t\,dt + C_S\,dS + \tfrac{1}{2} C_{SS}\,(dS)^2. \]
Making the substitutions gives
\[ dC = (C_t + \mu S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS})\,dt + \sigma S C_S\,dW = \mu_C C\,dt + \sigma_C C\,dW. \]
Form an arbitrage portfolio with investments \(w_S + w_C + w_B = 0\). The return on this investment is
\[ \frac{d\Pi}{\Pi} = w_S\frac{dS}{S} + w_C\frac{dC}{C} + w_B\,r\,dt \]
\[ = w_S[\mu\,dt + \sigma\,dW - r\,dt] + w_C[\mu_C\,dt + \sigma_C\,dW - r\,dt] \]
\[ = [w_S(\mu - r) + w_C(\mu_C - r)]\,dt + [w_S\sigma + w_C\sigma_C]\,dW. \]
Choose \(w_S\) and \(w_C\) such that there is no risk, \(w_S\sigma + w_C\sigma_C = 0\). With no risk, \(w_S(\mu - r) + w_C(\mu_C - r) = 0\) to avoid arbitrage, so
\[ \frac{\mu - r}{\sigma} = \frac{\mu_C - r}{\sigma_C} = \lambda, \]
the market price of risk. Making the substitutions,
\[ \frac{\mu - r}{\sigma} = \frac{(C_t + \mu S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS})/C - r}{\sigma S C_S / C}. \]
Simplifying gives the PDE
\[ C_t = rC - rSC_S - \tfrac{1}{2}\sigma^2 S^2 C_{SS}. \]
Derivation II: Using CAPM
An alternative derivation uses the CAPM. The beta of an option is a function of the stock beta and the elasticity of the option price with respect to the stock price,
\[ \beta_C = \beta_S C_S \frac{S}{C}. \]
The expected returns on the stock and option are
\[ E\!\left[\frac{dS}{S}\right] = (r + \alpha\beta_S)\,dt = \mu\,dt \quad\text{and}\quad E\!\left[\frac{dC}{C}\right] = (r + \alpha\beta_C)\,dt = \mu_C\,dt. \]
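Making the substitution, \(E[dC] = (rC + \alpha S C_S \beta_S)\,dt\). By Ito's Lemma,
\[ dC = (C_t + \mu S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS})\,dt + \sigma S C_S\,dW = \hat{\mu}_C C\,dt + \sigma_C C\,dW. \]
Taking expectations and setting the two expressions equal gives the Black Scholes PDE
\[ C_t = rC - rSC_S - \tfrac{1}{2}\sigma^2 S^2 C_{SS}. \]
Solving the PDE
The following method makes use of the Feynman-Kac (Cox-Ross) solution. The boundary condition is \(C(S_T, T) = (S_T - K)^+\).
\[ C = E^Q[e^{-r\tau}(S_T - K)^+] = e^{-r\tau} E^Q[(S_T - K)^+] \]
\[ = e^{-r\tau} E^Q[S_T \mid S_T \ge K]\,\mathrm{Prob}[S_T \ge K] - K e^{-r\tau}\,\mathrm{Prob}[S_T \ge K]. \]
Next, get the conditional distribution
\[ \ln S_T \mid \ln S_t \sim N\!\left(\ln S_t + (r - \sigma^2/2)\tau,\; \sigma^2\tau\right) = N(m, v^2). \]
The density is (see footnote 1)
\[ f(S_T \mid S_t) = f(\ln S_T \mid \ln S_t)\cdot\frac{\partial \ln S_T}{\partial S_T} \]
\[ = \frac{1}{S_T\sqrt{2\pi\sigma^2\tau}} \exp\!\left[ -\frac{(\ln S_T - \ln S_t - (r - \sigma^2/2)\tau)^2}{2\sigma^2\tau} \right] \]
\[ = \frac{1}{v S_T\sqrt{2\pi}} \exp\!\left[ -\frac{1}{2}\left(\frac{\ln S_T - m}{v}\right)^2 \right]. \]
Footnote 1: To derive this, realize that under Q, \(dS = S r\,dt + S\sigma\,dZ\). Let \(x = \ln S\), so
\[ dx = \frac{\partial x}{\partial S}\,dS + \frac{1}{2}\frac{\partial^2 x}{\partial S^2}(dS)^2 = \frac{dS}{S} - \frac{1}{2S^2}(dS)^2 = (r - \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dZ. \]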
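Next calculate the terms involving \(S_T\) and K:
\[ \mathrm{Prob}[S_T \ge K] = \mathrm{Prob}[\ln S_T \ge \ln K] = 1 - \mathrm{Prob}[\ln S_T \le \ln K] \]
\[ = 1 - \phi\!\left(\frac{\ln K - m}{v}\right) = \phi\!\left(\frac{m - \ln K}{v}\right) \]
\[ = \phi\!\left(\frac{\ln(S_t/K) + (r - \sigma^2/2)\tau}{\sigma\sqrt{\tau}}\right) = N(d_2). \]
Using the same idea and a change of variable \(y = \ln S_T\), so \(e^y = S_T\) and \(dS_T = e^y\,dy\),
\[ E^Q[S_T \mid S_T \ge K]\,\mathrm{Prob}[S_T \ge K] = \int_K^\infty S_T \frac{1}{S_T v\sqrt{2\pi}} \exp\!\left[ -\frac{1}{2}\left(\frac{\ln S_T - m}{v}\right)^2 \right] dS_T \]
\[ = \int_{\ln K}^\infty \frac{1}{v\sqrt{2\pi}} \exp\!\left[ -\frac{1}{2}\left(\frac{\ln S_T - m}{v}\right)^2 \right] \exp(\ln S_T)\, d\ln S_T \]
\[ = \int_{\ln K}^\infty \frac{1}{v\sqrt{2\pi}} \exp\!\left[ -\frac{1}{2v^2}\left(\ln S_T - (m + v^2)\right)^2 + m + v^2/2 \right] d\ln S_T \]
\[ = \exp(m + v^2/2) \int_{\ln K}^\infty \frac{1}{v\sqrt{2\pi}} \exp\!\left[ -\frac{1}{2}\left(\frac{\ln S_T - (m + v^2)}{v}\right)^2 \right] d\ln S_T \]
\[ = \exp(m + v^2/2)\left[ 1 - \phi\!\left(\frac{\ln K - (m + v^2)}{v}\right) \right] = \exp(\cdot)\,\phi\!\left(\frac{m + v^2 - \ln K}{v}\right) \]
\[ = \exp\!\left[\ln S_t + (r - \sigma^2/2)\tau + \sigma^2\tau/2\right] \phi\!\left(\frac{\ln(S_t/K) + (r - \sigma^2/2)\tau + \sigma^2\tau}{\sigma\sqrt{\tau}}\right) \]
\[ = S e^{r\tau} N(d_1). \]
Combining these results gives the Black Scholes model
\[ C(S,t) = S N(d_1) - K e^{-r\tau} N(d_2) \]
where
\[ d_1 = \frac{\ln(S/K) + (r + \sigma^2/2)\tau}{\sigma\sqrt{\tau}} \quad\text{and}\quad d_2 = d_1 - \sigma\sqrt{\tau}. \]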
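The closed-form result is straightforward to evaluate. The following is a minimal sketch for a European call on a non-dividend-paying stock; the function names and the numerical inputs are illustrative.

```python
import math

def norm_cdf(x):
    """Standard normal CDF, N(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S, K, r, sigma, tau):
    """C = S*N(d1) - K*exp(-r*tau)*N(d2)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

# Illustrative inputs: at-the-money, one year, 5% rate, 20% volatility.
price = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
```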
4.3.2
Implied Volatilities
The volatility parameter is the most difficult to obtain and perhaps the most important. An alternative to using the model to give an option price is to invert the model to give an implied volatility, taking option prices as inputs.
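Inverting the model for volatility has no closed form, so a numerical search is needed. The sketch below uses bisection, which is one simple choice (an assumption of this example, as are the search bounds and tolerances), and relies on the call price being increasing in volatility.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S, K, r, sigma, tau):
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def implied_vol(price, S, K, r, tau, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on sigma: shrink [lo, hi] until the model price matches."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_scholes_call(S, K, r, mid, tau) > price:
            hi = mid          # model price too high: volatility is lower
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

A round trip (price an option at a known volatility, then invert) is a convenient sanity check of such a routine.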
4.3.3
Hedging
Hedging involves forming portfolios to reduce or minimize various types of risk. The most common hedge is a delta-neutral position. This investment has an expected price change of zero when the stock price changes: the loss from a drop in the stock is offset by a gain on an option. This is a local hedge, since the delta changes when the stock price changes. A gamma-neutral hedge preserves the delta-hedge. Other hedges include rho for the interest rate and vega for volatility. Again, these are partial hedges and assume everything else is constant. To determine the appropriate hedge, find the options with the maximum and minimum pricing error per unit of stock-equivalent risk, (model − market)/∆. Buy and sell these options in amounts proportional to the inverse of the delta to balance the stock-equivalent risk. For a gamma hedge, combine two delta-neutral portfolios such that the gammas balance.
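As a small illustration of the hedge ratios, the Black-Scholes delta and gamma of a European call on a non-dividend-paying stock can be computed directly. The function names and inputs are illustrative; the delta gives the number of shares held per call written to make the position locally delta-neutral.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def call_delta_gamma(S, K, r, sigma, tau):
    """Delta = N(d1); gamma = n(d1) / (S * sigma * sqrt(tau))."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    delta = norm_cdf(d1)
    gamma = norm_pdf(d1) / (S * sigma * math.sqrt(tau))
    return delta, gamma

delta, gamma = call_delta_gamma(100.0, 100.0, 0.05, 0.2, 1.0)
shares_per_call_written = delta   # long this many shares against one short call
```

Because gamma is positive, the delta rises with the stock price and the hedge must be rebalanced, which is why the delta hedge is only local.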
4.4
Advanced Topics
4.4.1
American Options
Boundary Conditions
Define \(S_t^*\) as the exercise boundary. Conditions for early exercise require
\[ \lim_{S_t \to S_t^*} C(S_t) = S_t^* - K, \qquad \lim_{S_t \to S_t^*} \frac{\partial C(S_t)}{\partial S_t} = 1. \]
With dividends the stock process is
\[ \frac{dS}{S} = (r - \delta)\,dt + \sigma\,d\tilde{W} \]
so
\[ dC = [C_t + (r - \delta) S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS}]\,dt + \sigma_C\,d\tilde{W} = rC\,dt + \sigma_C\,d\tilde{W}. \]
The resulting PDE is
\[ C_t + (r - \delta) S C_S + \tfrac{1}{2}\sigma^2 S^2 C_{SS} - rC = 0 \]
with boundary conditions
\[ C_T(S_T) = (S_T - K)^+ \quad\text{and}\quad C_0(S_0) = \sup_{\tau \in [t,T]} E^Q[e^{-r(\tau - t)}(S_\tau - K)^+]. \]
At the boundary you are indifferent to exercising, since exercising generates \(dS + (\delta S - rK)\,dt\) while continuing generates
\[ dC = (C_t + \tfrac{1}{2}\sigma^2 S^2 C_{SS})\,dt + C_S\,dS \]
\[ = [rC - (r - \delta)S]\,dt + dS \]
\[ = [r(S - K) - (r - \delta)S]\,dt + dS \]
\[ = (\delta S - rK)\,dt + dS. \]
To exercise you borrow \(rK\) and receive \(\delta S\), so \(rK = \delta S\) and \(S_T^* = rK/\delta\).
Integration
Broadie & DeTemple and Barone-Adesi & Whaley. Let \(C_t\) and \(c_t\) denote American and European call option values. We can write \(C_t(S_t) = c_t(S_t) + \varepsilon_t\), so
\[ C_0(S_0) = E^Q[e^{-rt}(S_T - K)^+] + E^Q\!\left[\varepsilon \int_0^T e^{-rt}\,df(\tau)\right]. \]
CHECK
L-U Bound
Broadie & DeTemple find upper and lower bounds on American options by using capped calls. A capped call value can be found for a given early exercise path.
BBS/Richardson Extrapolation
The binomial Black-Scholes method (BBS) is essentially a binomial tree with the analytic BS formula attached at the last node. This avoids some of the problems from discretization in a tree, but preserves the ability to price American options. Richardson extrapolation involves calculating the price with N nodes and again with 2N nodes. The option price is then calculated as twice the second value minus the first (e.g., \(p = 2p_{2N} - p_N\)), which cancels a pricing error that shrinks in proportion to 1/N. This avoids the odd-even effect and allows use of a small N.
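A compact sketch of BBS with Richardson extrapolation for an American put follows. It is an illustration under simple assumptions (no dividends, \(u = e^{\sigma\sqrt{\tau/n}}\), \(d = 1/u\)); the function names are hypothetical. The extrapolated value here is computed as \(2p_{2N} - p_N\), which cancels an error term proportional to 1/N.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_put(S, K, r, sigma, tau):
    """European Black-Scholes put."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return K * math.exp(-r * tau) * norm_cdf(-d2) - S * norm_cdf(-d1)

def bbs_american_put(S, K, r, sigma, tau, n):
    """Binomial tree whose last step uses the BS formula as the hold value."""
    dt = tau / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    Rf = math.exp(r * dt)
    pi = (Rf - d) / (u - d)
    step = n - 1   # one step before expiry: hold value is BS with dt remaining
    values = []
    for j in range(step + 1):
        s = S * u**j * d**(step - j)
        values.append(max(bs_put(s, K, r, sigma, dt), K - s))
    for step in range(n - 2, -1, -1):
        values = [max((pi * values[j + 1] + (1 - pi) * values[j]) / Rf,
                      K - S * u**j * d**(step - j))
                  for j in range(step + 1)]
    return values[0]

def bbs_richardson(S, K, r, sigma, tau, n):
    """Two-point Richardson extrapolation: 2*p(2n) - p(n)."""
    return (2.0 * bbs_american_put(S, K, r, sigma, tau, 2 * n)
            - bbs_american_put(S, K, r, sigma, tau, n))
```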
4.4.2
Exotic Options
Barrier options utilize the reflection principle. Put-call symmetry says \(C(S, K, r, \delta) = P(K, S, \delta, r)\). To price a down-and-out call, let H denote the barrier, \(x_t = \ln(S_t/S_0)\), \(y_t = \inf_{t\in[0,T]} x_t\), \(Y_t = \sup_{t\in[0,T]} x_t\), and \(y = \ln(H/S_0)\). Then
\[ C_{doc} = e^{-rt} E^Q\!\left[(S_T - K)^+\,\mathrm{Prob}[y_T \ge y]\right] \]
\[ = e^{-rt} E^Q\!\left[(S_0 e^{x_T} - K)\,\mathrm{Prob}[y_T \ge y,\; x_t > \ln(K/S_0)]\right]. \]
To price lookback options,
Standard: \(C = (S_T - M_{T_0}^{T})^+\), \(P = (M_{T_0}^{T} - S_T)^+\)
Extreme: \(C = (M_{T_0}^{T} - K)^+\), \(P = (K - M_{T_0}^{T})^+\)
Asian options can be of the form
\(C = (\bar{S} - K)^+\), \(P = (K - \bar{S})^+\)
\(C = (S - \bar{K})^+\), \(P = (\bar{K} - S)^+\)
4.4.3
Other Advanced Topics
Stochastic Volatility and Jumps
Monte Carlo, QMC, etc.
Parametric Pricing

Table 4.2: Common Interest Rate Models

Stochastic Interest Rate    Stochastic Term Structure
Rendleman & Bartter         Ho & Lee
Courtadon                   HJM
Vasicek                     Black, Derman & Toy
CIR
4.5
Interest Rate Derivatives
An underlying assumption of the preceding option models is that the asset follows a lognormal or binomial process. For fixed income securities this assumption is not valid. The price of these securities must end up back at par when they mature. Surprisingly, the fact that we know the terminal price makes option pricing more difficult. With the view that this is an additional constraint on the system it is more understandable why interest rate options have this added complexity. Many of the interest rate models are discussed more fully in Chapter 3. There are three basic steps in interest rate option pricing. First, random interest rates are modeled. Next, the distribution of the interest rates is used to infer the distribution of prices for the underlying debt instrument. Finally, the distribution of the underlying asset is used to price the option. Interest rate option models can be broadly categorized as stochastic interest rate models and stochastic term structure models. Refer to Section 3.5 for a discussion. The stochastic interest rate approach is subject to error since the option model is based on bond prices that are potentially wrong. In the stochastic term structure models, market data is used to get a distribution for the bond price, which, in turn, is used to price the option.
4.5.1
Stochastic Interest Rate Models
The Rendleman & Bartter model uses a binomial process and assumes interest rate changes are a constant percentage. Courtadon models the interest rate process in continuous time. Like the RB model, interest rates are lognormal. However, Courtadon adds a mean-reversion feature which overcomes the problem in RB that the interest rate can become infinitely large. The Vasicek model includes mean-reversion, but uses a normal process, allowing negative interest rates. The CIR model modifies Vasicek by using a square root process which produces variance proportional to the level of the interest rate. Refer to Table 3.1 for a summary of model specifications.
4.5.2
Stochastic Term Structure Models
The presentation of the following models is based on their discrete time analogs.
Ho & Lee
The Ho-Lee () model generates parallel shifts in the yield curve. In a binomial setting it produces a recombining tree. It is based on
\[ R_{t,j} = \frac{D^{(t)}}{D^{(t+1)}} \cdot \frac{\pi + (1 - \pi)\delta^{t}}{\delta^{t-j}} \]
where \(\delta = \exp[-2\phi(\tau/n)^{1.5}]\), \(D^{(t)}\) is the current price of the bond maturing at time t, and \(\phi\) is the standard deviation of the log yield of one-year discount bonds.
BDT
The Black, Derman, & Toy () model features a fixed ratio of adjacent prices at each point in time, \(\alpha_t\). The rate can be expressed as \(r_{t,j} = \alpha^{j} r_{t,0}\). In the Ho-Lee model this ratio is fixed for all t.
Heath, Jarrow, & Morton
The Heath, Jarrow & Morton model is the most general term structure model. Although the HJM model is set in continuous time, a discrete time analog is available. Market bond prices and volatilities are used to determine the
interest rate process (tree). In terms of notation, let \(D^m_{t,k}\) denote the price of a bond maturing at m observed at time t in state k. Since the tree does not recombine, there are \(2^t\) nodes at time t. The states are indexed with the lowest (all downs) state as 0 and the highest state as \(2^t - 1\). By convention, up states are when the bond price increases (and interest rate decreases). The risk neutral and true probabilities of an up-state are \(\pi\) and \(\theta\). There are two equations expressing current price and volatility as functions of the next period prices.
\[ D^m_{t,k} = \frac{\pi D^m_{t+1,2k+1} + (1 - \pi) D^m_{t+1,2k}}{1 + r_{t,k}} \]
\[ \sigma_{t+1} = \frac{\ln\!\left(\dfrac{1}{D^m_{t+1,2k}}\right) - \ln\!\left(\dfrac{1}{D^m_{t+1,2k+1}}\right)}{2(m - t - 1)} \]
The second equation can be more conveniently expressed as
\[ D^m_{t+1,2k+1} = \exp[\sigma_{t+1} \cdot 2(m - t - 1)]\, D^m_{t+1,2k}. \]
The values \(D^m_{t,2k}\) and \(\sigma_{t+1}\) are generally estimated (or given) and the prices at t + 1 are determined by solving the equations simultaneously at each node. Unlike the standard binomial pricing model, this model is solved by stepping forward through the tree.
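The forward step at a single node can be sketched by solving the two equations above for the two next-period prices. The function name, the default \(\pi = 0.5\), and the numerical inputs are assumptions made for illustration only.

```python
import math

def hjm_forward_step(D_now, r_now, sigma_next, m, t, pi=0.5):
    """Given D^m_{t,k}, r_{t,k} and sigma_{t+1}, return (D_up, D_down) at t+1.

    Uses D_up = c * D_down with c = exp(2*sigma*(m - t - 1)) and the
    pricing equation D_now = (pi*D_up + (1 - pi)*D_down) / (1 + r_now).
    """
    c = math.exp(sigma_next * 2.0 * (m - t - 1))
    D_down = D_now * (1.0 + r_now) / (pi * c + (1.0 - pi))
    D_up = c * D_down
    return D_up, D_down

# Illustrative (assumed) inputs: a bond maturing at m = 3, observed at t = 0.
D_up, D_down = hjm_forward_step(0.85, 0.05, 0.01, m=3, t=0)
```

Applying this step at every node, moving from t to t + 1, builds the non-recombining tree forward in time.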
Chapter 5
Corporate Finance
5.1
Introduction
Corporate finance covers a range of issues related to the choice of capital structure, distribution of cashflows, and issuance of securities. Asymmetric information problems are common and are the subject of much of the work in corporate finance. Also important are the agency costs that arise from the conflicts of interest between the decision makers and other parties. A common example of an agency cost arises between the manager and the outside owners of the firm. The compensation contract offered to managers is one way of dealing with this agency cost. Many of the earlier works make relatively strong assumptions. The last few sections attempt to understand the implications of relaxing these assumptions. When demand for assets is not perfectly elastic there will be price effects caused by changes in quantity. Similarly, imperfections may give rise to financial innovation.
5.2
Information Asymmetry/Signaling
An information asymmetry occurs when one group of agents has better information than other groups. Adverse selection arises when an agent making a decision is better informed than the person with whom he is contracting. This is different from moral hazard, where the agent with superior information can influence the outcomes by his action. A signal is an action that an agent takes to provide credible information. The signal must impose a
greater cost on the “low quality” agents than on the “high quality” agents to prevent mimicking. A common element of signaling papers in finance is that the source of the informational asymmetry generally comes from managers' superior forecasts of future cash flows. Investors are typically homogeneous with respect to taxes and restrictions on trading, otherwise clienteles would arise. Models also usually prevent the manager from trading personally. The outcomes depend on the nature of the informational asymmetry. If the asymmetry is over the assets in place, but not the new project, overpricing and project scaling are the most efficient signals. If there is asymmetric information about the project's value, and good firms have more valuable projects than bad firms, signals which burn money after the issuance are dominant. The overinvestment problem (when only project value is asymmetric information) can be eliminated through money burning, but the underinvestment problem (when there is differential information about the assets in place) is not completely solved. The distinction that the burned money comes from project cash flows is important. Equity financed money burning is an inefficient signal.
Akerlof (1970)
In his famous “lemons” paper, Akerlof (1970) shows how quality uncertainty can affect the size and average quality in the automobile market. In extreme cases, markets can fail completely. A case is made for the role of certain institutions in improving the efficiency of the market. The sellers of the cars know more about the quality of the cars than the buyers. Demand depends on price and average quality, \(Q_D = D(p, \mu)\). The average quality will also depend on price, \(\mu = \mu(p)\). Supply depends on price as well, \(Q_S = S(p)\). In equilibrium, \(S(p) = D(p, \mu(p))\). A low average quality will cause the owners of good cars not to sell, further lowering the average quality. Several applications of the basic model are discussed.
In the insurance market, healthy individuals will tend to opt out of the market, leaving the insurer with a disproportionately large share of the unhealthy. Costs of dishonesty must include both the direct costs as well as the indirect costs of driving business out of the market. There are several institutions that can mitigate these types of problems. Risk transferring guarantees can allow the
owners of good cars to get their fair value. Brand-names and chains can also reduce quality uncertainty, as do licensing practices.
Spence
Spence (1973) develops the signaling model in the context of the job market. This is really just one example of an investment under uncertainty problem. Here the potential employer cannot observe the quality of the applicants, but can use education to make rational inferences about the applicant's quality. Education separates the types of applicants because it is costly to obtain, and more so for the low types. The high types will obtain just enough education to make it unattractive for a low type to mimic. The employers offer wage schedules that are a function of the educational signal and other (non-signal) indices. Individuals choose education levels to maximize wages net of signaling costs. For the signaling equilibrium to work the costs of signaling must be negatively correlated with productive capacity. Otherwise, lower quality types will overinvest in the signal to mimic higher quality types. The use of indices results in forming probability distributions conditional on both the signal and the indices. This segments the population by indices, and these subsets need not have the same equilibrium. Spence (1974) is a more general description of the signaling environment. There must be an information asymmetry where the seller knows more about the good than the buyer. The seller signals and the buyer responds. The signal is based on the anticipated response of the buyer.
Myers & Majluf (1984)
Myers and Majluf (1984) is the classic paper on financing under asymmetric information. A firm has a positive NPV project that requires external financing. If the manager believes the stock is underpriced, pursuing the project requires issuing underpriced stock, diluting the value to existing shareholders. Investors will then believe that when a firm does issue, it is likely that the stock is overpriced.
Consequently, announcements of new issues generate a share price decline. In the basic model, a firm has existing assets in place, a, and a valuable investment opportunity, b > 0. The project is all-or-nothing and requires the issuance of equity to make the initial investment I. The firm currently
has slack S, which is fixed and publicly known, and would need to raise additional equity \(E = I - S\). There are three dates in the model. At time t − 1 the market has the same information as management; valuations are given by \(\bar{A}\) and \(\bar{B}\). At time t the manager learns a and b, while the market knows only the distribution of \(\tilde{A}\) and \(\tilde{B}\). Additional assumptions are perfect markets, costly signaling, and passive existing shareholders. Managers act in the interest of old shareholders by maximizing \(V_0^{old} = V(a, b, E)\). The market value of the shares will generally be different from the manager's valuation since the market does not know a or b. Denote the market value \(P'\) if stock is issued and \(P\) otherwise. The managers will issue new stock when
\[ \frac{E}{P' + E}(S + a) \le \frac{P'}{P' + E}(E + b). \]
In words this says the old shareholders must get more of the new value than the new shareholders get of the original value. The firm is more likely to issue when b is high or a is low. Rearranging, the indifference equation is \(b = (E/P')(S + a) - E\). Above this line the firm will issue and invest, below it will do nothing. The issue price \(P'\) is given by
\[ P' = S + \bar{A}(M') + \bar{B}(M') \]
where the last terms represent expected values given issuance. Unless the firm is certain to issue, \(P' < P\). The decision not to issue is good news about the value of the existing assets. The ex ante loss from passing up good projects is \(L = F(M)\bar{B}(M)\). With \(S > I\), \(L = 0\). If the firm could be split, then the problem goes away. The solution of \(P'\) requires a simple numerical algorithm. Start by setting \(P' = S + \bar{A} + \bar{B}\). Then determine \(M\) and \(M'\) and calculate \(P' = S + \bar{A}(M') + \bar{B}(M')\). Repeat this procedure until convergence. The above analysis can be extended to include debt financing. If the firm can issue riskless debt the problem disappears: the firm always issues riskless debt and takes the project. If the firm can only issue risky debt the problem is reduced, but not eliminated.
Thus, the general rule is to issue securities less subject to mispricing first. The firm will issue and invest only when b ≥ ∆D (or ∆E). We should have |∆D| < |∆E| and with the same
signs. In this case the firm will never issue equity. Any time it decides to issue it will use debt. This extreme condition can be tempered by introducing costs of debt such as bankruptcy or agency costs. Note that if the information asymmetry is about the variance of value rather than the mean, then equity will dominate debt. The model makes a number of predictions. It says it is generally better to issue safe securities, a pecking order result. Firms with insufficient slack may forgo good investment opportunities, the underinvestment problem. Firms can build up slack by retaining earnings or issuing securities when information asymmetries are small to avoid some of these problems. Firms should avoid issuing risky securities to pay dividends. Stock price will fall when managers have superior information and they issue securities. A merger between a firm with little slack and one with a lot of slack is likely to increase value, but negotiating such a merger is likely to be difficult. The basic Myers and Majluf (1984) framework has been extended in a number of ways, including dividend policy, scale of investment, project timing, and public offerings (overpricing and underpricing).
Cooney & Kalay (1993)
Cooney and Kalay (1993) extend the classic Myers and Majluf (1984) paper to allow for the possibility of negative NPV projects. With this additional realism the stock price reaction to equity issuance is not necessarily negative. Note that it is the existence, not acceptance, of negative NPV projects that drives this result. In the new model, low values of a can cause overinvestment. Firms will accept negative NPV projects in order to sell overvalued existing assets. Also, firms with riskier new projects may experience stock price increases on issuance announcement. The revised model has lower issue prices and probability of issuance. When there is a limited supply of zero NPV projects (e.g., transactions costs and taxes for financial investments) there may be positive announcement effects.
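Returning to the Myers and Majluf (1984) issue price, the iterative procedure for \(P'\) described earlier can be sketched as a fixed-point iteration. This is only an illustration: representing the market's beliefs by an equally weighted list of (a, b) draws, the function name, and the toy numbers are all assumptions of this example.

```python
def myers_majluf_issue_price(S, E, draws, tol=1e-10, max_iter=1000):
    """Iterate P' = S + mean(a | issue) + mean(b | issue) until it settles.

    draws: equally likely (a, b) pairs standing in for the market's beliefs.
    A type issues when b >= (E / P') * (S + a) - E.
    Returns None if no type issues at some iteration.
    """
    a_bar = sum(a for a, _ in draws) / len(draws)
    b_bar = sum(b for _, b in draws) / len(draws)
    price = S + a_bar + b_bar                     # starting value
    for _ in range(max_iter):
        issuers = [(a, b) for a, b in draws
                   if b >= (E / price) * (S + a) - E]
        if not issuers:
            return None
        new_price = (S + sum(a for a, _ in issuers) / len(issuers)
                       + sum(b for _, b in issuers) / len(issuers))
        if abs(new_price - price) < tol:
            return new_price
        price = new_price
    return price

# Toy example (assumed numbers): the high-a type prefers not to issue.
price_if_issue = myers_majluf_issue_price(S=0.0, E=100.0,
                                          draws=[(50.0, 5.0), (250.0, 5.0)])
```

In the toy example only the low-a type issues, so the issue price falls well below the unconditional value \(S + \bar{A} + \bar{B}\), in line with the prediction that issuance is bad news about assets in place.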
5.3
Agency Theory
An agency relationship is a contract under which a principal engages an agent to perform some task on his behalf which involves delegating some decisionmaking authority to the agent. The agent may not always have natural
incentives to act in the best interest of the principal. The principal can address this problem by establishing the appropriate incentives and/or monitoring the agent. Incentive alignment is rarely free; agency costs are defined as the sum of monitoring costs, bonding costs, and the residual loss.
Jensen & Meckling (1976)
In a widely cited paper, Jensen and Meckling (1976) develop a theory of the ownership structure of the firm using elements of property rights, agency, and financial theory. Property rights specify how costs and rewards will be allocated among the participants in an organization. The firm is defined as a legal fiction which serves as a “nexus for contracting.” The firm has divisible claims on assets and cashflows, but does not have intentions, behaviors, or motivations. The paper focuses on the positive aspects of agency theory, that is, the interaction of the various parties assuming they act optimally. Most of the previous literature was normative in nature. The presence of inside and outside equity owners introduces an agency cost of outside equity. This arises because the outsider funds a portion of the insider's perquisite consumption, so the insider will consume “too much.” As long as the market anticipates this, all these costs will be passed back to the insider. The manager consumes perquisites F. When he is the sole proprietor he chooses to consume \(F^*\) and the firm is worth \(V^*\). Every dollar of perqs he forgoes increases the value of the firm by a dollar. He chooses the point with the highest utility given his budget set. The manager sells a share (1 − α) to an outsider. Now the manager pays only $α for every dollar in benefits. If the outsider pays \(V^*\), holding \(F^*\) constant, the budget slope changes to −α. If the manager can change his consumption, he will increase consumption, which lowers the value of the firm. With rational expectations the market will foresee this and will pay only \(V'\).
At this point the manager is over-consuming perqs; by decreasing to \(F'\) he increases his utility and the value increases to \(V'\). The entire decrease in value \(V^* - V'\) is borne by the insider. This is a gross cost, since it does not include the benefit from increased consumption. The net cost is given by the change in the utility levels. Introducing monitoring allows an improvement. The insider receives all the benefits from monitoring (i.e., he bears all the net costs). It does not matter who actually makes the payment for these costs since they all fall
back to the insider in the end. The outcome is suboptimal or inefficient only relative to a world with no agency costs. Given that these costs exist, and since the insider bears them, the insider will minimize them. The size of these agency costs will depend on managers’ tastes, the degree of managerial discretion, monitoring and bonding costs, the difficulty of measuring performance, and the costs of devising, implementing, and enforcing incentive contracts.

The scale of the firm can also be analyzed in this framework. When the insider lacks sufficient resources and needs external financing, agency costs reduce the value of the firm at a given level of fringe benefits consumption. The insider will stop increasing the value of the firm when the gross increment in value is offset by the incremental loss in the consumption of additional fringe benefits. The end result is that the insiders are worse off than before. They cannot do as well as before because they cannot credibly commit to not consuming additional benefits.

Debt financing creates a risk-shifting incentive since equity holders enjoy the benefits of positive outcomes without a matching liability for negative outcomes, much like an option. This is an overinvestment problem: the firm takes bad projects because the equity holders can expropriate wealth from the bondholders. Again, monitoring and bonding are possible (partial) solutions, but are likely to be difficult to implement.

In a multiperiod setting, “being good” will reduce agency costs due to a reputation effect. Yet the problem will not be solved, since each agent has an end to their game and will always eventually face the temptation to shirk. Inside debt may help reduce the problems as well, since the manager will not be tempted to expropriate wealth from his own bonds. In some sense the manager’s salary may serve this purpose. He may take measures to preserve his salary, including pursuing safe investments.
Incentive compensation, such as options, may be effective in this case. Convertible securities may give managers an incentive to avoid risk-shifting. Security analysts may also help reduce agency costs. In situations where it is easy for the insider to consume perqs, less outside equity should be used.

Jensen (1986)

Jensen (1986) discusses how free cashflows (FCF) can cause agency costs by allowing managers discretion to make bad investments. Reducing FCF can minimize a manager’s ability to waste resources and it also subjects the firm
to more frequent monitoring since it has to access the capital markets more often. The agency costs of debt have been cited as a reason to use less debt. Jensen points out that debt can also help reduce agency costs by reducing FCF. Debt can be viewed as a substitute for dividends in this sense. Additional debt will also serve to increase efficiency as bankruptcy becomes more likely.

There is evidence supporting these claims. Leverage-increasing transactions are associated with increases in equity value. LBO targets tend to have high FCF and low growth opportunities. Also, strip, or mezzanine, financing limits the conflicts of interest between classes of security holders.

The FCF hypothesis also applies to takeovers. Firms with high FCF and unused borrowing power are likely to undertake bad mergers. Takeovers, especially hostile ones, can generate the crisis needed to make changes. Within declining industries, mergers are likely to be value-enhancing since they remove resources from a relatively unproductive sector. Acquirers tend to have performed well, generating excess cash to pursue the acquisition. Targets tend to have either poor managers and poor performance or good performance and significant FCF. Cash- or debt-financed takeovers generally provide larger benefits than transactions financed with stock.

Fama (1980)

Fama (1980) explains how the separation of ownership and control in a large corporation is an efficient organizational form. The basic idea is that management is a special type of labor which coordinates inputs and makes decisions. Management rents its human capital to the firm. Risk bearers provide capital ex ante in exchange for uncertain future payments. The capital markets and managerial labor markets provide discipline to the manager. Monitoring occurs within and among management, up and down the chain of command. The board monitors top management; it can include top management but should also include outsiders.
A distinction is made between ownership of the firm and ownership of capital. Since the firm is a collection of contracts, no one really owns it. Rather, security holders own claims on the cashflows. With this view, control rights over a firm’s decisions do not necessarily lie with the security holders. In order to hold the manager accountable there must be some mechanism for ex post settling up. The general necessary conditions are uncertainty about managerial talents or tastes, labor markets that efficiently use past
information in determining wages, and a wage revision process that is strong enough to resolve incentive problems. When the manager is the sole proprietor he cannot avoid ex post settling up with himself. The optimal pay incentives are effort based. Effort-based pay does not expose managers to risks beyond their control, for which they would demand compensation. The problem is that effort is difficult to measure. When performance measures are noisy, less weight should be put on recent results.

Lehn & Poulsen (1989)

Lehn and Poulsen (1989) analyze FCF in privatizing transactions to identify the sources of value. Unlike other corporate control transactions, synergies are not a potential source of value. The four sources under consideration are tax effects, wealth redistribution, asymmetric information, and agency costs. The results are largely consistent with the FCF hypothesis. These transactions are more likely in firms with high CF/EQ or low sales growth. Premiums paid are also positively related to CF/EQ. The results are strongest in the hostile takeover wave of the mid-1980s and among firms with low management ownership.

The analysis consists of two parts. First, firms that went private are contrasted with a control sample of firms that did not, to understand the factors important in the decision. This is done by comparing means of the groups and also in a logit regression. The variables of interest are CF/EQ, Tax/EQ, sales growth, and footsteps, a dummy for competing bids or rumors. The results indicate that privatized firms are larger, have more cash, have slightly lower recent growth, and are more likely to have other bids. These effects tend to become stronger in the second half of the sample. The second part of the paper attempts to explain the cross-sectional variation in premiums in these transactions by regressing the premium on CF/EQ, Tax/EQ, and sales growth.
The results are generally supportive of the FCF hypothesis, especially in the second half of the sample and among low management ownership firms.
5.4 Capital Structure
The capital structure decision balances the costs and benefits of the various financing choices. These can be categorized as taxes, bankruptcy costs, and agency costs. Tax considerations include both the advantages of debt
at the corporate level as well as the disadvantage at the individual level. Bankruptcy costs associated with debt are subdivided into direct and indirect components. Agency costs arise from the conflicts of interest between different investor classes and also with management. Although debt creates several agency costs, it actually reduces agency costs under the FCF hypothesis. Agency costs can lead to underinvestment or overinvestment.

Miller (1977)

This paper is a study of the way taxes affect capital market equilibrium. Before “Debt and Taxes” the view was that optimal capital structure involved balancing the corporate tax advantage of debt against the costs of financial distress (loss of tax shields, overinvestment, underinvestment, monitoring costs, etc.). Miller’s “horse and rabbit stew” refers to the corporate tax advantages of debt dominating the costs associated with bankruptcy. Miller adds personal tax considerations of investors to the mix. Taxes are important in the capital structure decision because they affect aggregate supply and demand for corporate securities. Using a bond market equilibrium analysis, Miller argues that the higher costs of borrowing negate the entire benefit of tax shields, so the capital structure choice is irrelevant for individual firms, although there will be an optimal amount of aggregate debt. With progressive corporate taxes, or if the differential information-related costs of debt versus equity are convex in the amount of debt, capital structure may in fact matter. The classic M&M Proposition I is modified to include personal taxes:
\[
V_L = V_U + \left[ 1 - \frac{(1 - \tau_C)(1 - \tau_S)}{(1 - \tau_B)} \right] B.
\]
Proposition I (with taxes) says that firms can increase value by issuing debt. But if this is the case then the market is not in equilibrium. Assuming for simplicity that there are no capital gains taxes, all bonds are riskless, and there are no transactions costs, Miller’s equilibrium is given by the curves S and D in Figure 5.1. The flat part of the demand curve represents the demand for taxable bonds by tax-exempt investors. To get taxable investors to hold bonds, the rate must be high enough to offset the taxes. The equilibrium is where $\tau_C = \tau_B$. In the more general case, with capital gains taxes, the equilibrium condition is $(1 - \tau_C)(1 - \tau_S) = (1 - \tau_B)$.
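A short sketch of the gain from leverage implied by the modified Proposition I may help. The function name `gain_from_leverage` and the tax rates below are hypothetical, chosen only to illustrate two special cases: with no personal taxes the gain collapses to the familiar $\tau_C B$, and under Miller's equilibrium condition it vanishes.

```python
def gain_from_leverage(debt, tau_c, tau_s, tau_b):
    """V_L - V_U = [1 - (1 - tau_c)(1 - tau_s) / (1 - tau_b)] * B.

    tau_c: corporate rate, tau_s: personal rate on equity income,
    tau_b: personal rate on bond income.
    """
    return (1.0 - (1.0 - tau_c) * (1.0 - tau_s) / (1.0 - tau_b)) * debt


debt = 100.0

# No personal taxes: the gain is tau_c * B, as in M&M with corporate taxes.
no_personal = gain_from_leverage(debt, tau_c=0.35, tau_s=0.0, tau_b=0.0)

# Miller's simple equilibrium (no capital gains tax, tau_b = tau_c): no gain.
miller_simple = gain_from_leverage(debt, tau_c=0.35, tau_s=0.0, tau_b=0.35)

# General equilibrium condition (1 - tau_c)(1 - tau_s) = (1 - tau_b).
tau_c, tau_s = 0.35, 0.20
tau_b_eq = 1.0 - (1.0 - tau_c) * (1.0 - tau_s)
miller_general = gain_from_leverage(debt, tau_c, tau_s, tau_b_eq)

print(no_personal, miller_simple, miller_general)
```

Away from the equilibrium condition the sign of the gain depends on whether the personal tax penalty on bond income is smaller or larger than the combined corporate and equity-level tax on equity income.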
[Figure 5.1: Bond Market Equilibrium. Vertical axis: rate $R$; horizontal axis: quantity $Q$. Demand curve $D = r_0/(1 - \tau_B)$; supply curves $S = r_0/(1 - \tau_C)$, $S_1 = r_0/(1 - \tau_C')$, and $S_2 = r_0/(1 - \tau_C') - d$. The rate $r_0$ and the equilibrium $(Q^*, r^*)$ are marked.]

The area between the supply and demand curves below the equilibrium is the “bondholder surplus.” This arises because rates are driven up to the point where the marginal investor’s tax rate is equal to the corporate rate, but all investors can get the same rate in the market. A crucial assumption in Miller is the inability to perform tax arbitrage: selling assets taxed at a high rate to buy those taxed at a lower rate. Clienteles may arise because of differences in tax treatment of various organizational forms and differences in transaction costs [Shin and Stulz (1996)].
DeAngelo & Masulis (1980)

DeAngelo and Masulis (1980) generalize Miller (1977a) to include more realistic taxes, bankruptcy, and agency costs. In this “modified balancing theory” the full burden of bankruptcy or lending costs is not necessarily borne by the debtors. Some of these costs are shifted to bond buyers in the form of lower risk-adjusted interest rates. Miller’s irrelevance result is shown to be extremely fragile. When either non-debt tax shields or bankruptcy/agency costs create an increasing marginal cost, individual firms do have an optimal capital structure. The single-period model$^1$ allows different tax rates for each investor, so long as the ordinary income tax rate is higher than the capital gains rate. All firms face the same marginal tax rates. The setup results in three tax brackets: those who prefer debt, those who prefer equity, and those who are indifferent.

$^1$ A multiperiod model with tax carryforwards, etc. would be qualitatively similar.
For the marginal investor $\mu$,
\[
\frac{(1 - \tau_B^\mu)\,\pi(s)}{P_B(s)} = \frac{(1 - \tau_E^\mu)\,\pi(s)}{P_E(s)} = \frac{(1 - \tau_B^\mu)}{\bar{P}_B} = \frac{(1 - \tau_E^\mu)}{\bar{P}_E} \quad \forall\, s.
\]
Non-debt tax shields such as depreciation are given by $\Delta$, $\Gamma$ is the dollar amount of tax credits, and $\theta$ represents the maximum fraction of the tax liability that can be shielded by tax credits. There are four outcomes that result from different states.

State               Debt      Equity
$[0, s_1]$          $X(s)$    $0$
$[s_1, s_2]$        $B$       $X(s) - B$
$[s_2, s_3]$        $B$       $X(s) - B - \tau_C [X(s) - B - \Delta](1 - \theta)$
$[s_3, \bar{s}]$    $B$       $X(s) - B - \tau_C [X(s) - B - \Delta] + \Gamma$
For all states up to $s_3$ the firm loses some of its tax shields, even though it may not be in bankruptcy. The value of the firm is given by $\int_S [B(s) + E(s)]\, ds$. In Miller’s world, $\Delta = \Gamma = 0$ so $s_1 = s_2 = s_3$. Taking the partial derivative with respect to $B$ gives $\bar{P}_B = \bar{P}_E (1 - \tau_C)$. The interpretation of the flat section of the supply curve is that all tax shields are fully utilized in all states of nature. The curve begins to slope to compensate the firm for some of these tax shields going unutilized in some states. In the new equilibrium, the net tax advantages of debt are equated with the expected default costs:
\[
(1 - \tau_B) - (1 - \tau_C)(1 - \tau_S) = E[\text{default and agency costs}].
\]
Firms with low earnings may lose some of the value of their tax shields. The incremental value of interest tax shields decreases as firms increase leverage, implying a negative slope for the supply curve of taxable corporate bonds. This is depicted as $S_1$ in Figure 5.1.

Adding leverage-related deadweight costs $d$ will cause the tax advantage of corporate borrowing to become more significant. At the margin, the deadweight cost per dollar of borrowing, $d^*$, is the same for all firms. The new supply curve $S_2$ has a more negative slope because of the deadweight costs. This reduces the level of aggregate borrowing and the equilibrium risk-adjusted rate of return. Leverage-related deadweight costs increase the marginal tax advantage of borrowing because they decrease the supply of bonds, eliminating some of the “bondholder surplus.”
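The state-by-state payoffs to debt and equity can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the function name `payoffs` is hypothetical, and the state boundaries are inferred from the payoff table (default when $X(s) < B$, no tax when $X(s) \le B + \Delta$, and credits capped at $\theta$ times the tax liability until $\Gamma$ is fully usable).

```python
def payoffs(x, b, tau_c, delta, gamma, theta):
    """Return (debt payoff, equity payoff) for state earnings x.

    b: promised debt payment, delta: non-debt tax shields (depreciation),
    gamma: dollar tax credits, theta: maximum fraction of the tax
    liability that credits can shield. Boundaries are assumptions
    inferred from the four-row payoff table.
    """
    if x < b:                       # [0, s1]: default, debt gets everything
        return x, 0.0
    taxable = x - b - delta
    if taxable <= 0.0:              # [s1, s2]: shields exceed income, no tax
        return b, x - b
    tax = tau_c * taxable
    credit = min(gamma, theta * tax)  # capped in [s2, s3], full in [s3, s_bar]
    return b, x - b - (tax - credit)


params = dict(b=50.0, tau_c=0.4, delta=20.0, gamma=4.0, theta=0.5)
for x in (40.0, 60.0, 80.0, 120.0):
    debt, equity = payoffs(x, **params)
    print(f"X = {x:6.1f}  debt = {debt:6.1f}  equity = {equity:6.1f}")
```

When the cap binds the equity payoff equals $X(s) - B - \tau_C[X(s) - B - \Delta](1 - \theta)$, and once credits are fully usable it equals $X(s) - B - \tau_C[X(s) - B - \Delta] + \Gamma$, matching the last two rows of the table.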
The existence of an optimal capital structure in this setting is essentially an empirical issue. Do deadweight costs and underutilization of tax shields have significant impacts on the rate of return to bondholders? There is evidence that deadweight costs and possible underutilization of tax shields are sufficiently significant to affect bond pricing. Evidence implies that leverage-related costs reduce the supply of corporate bonds and lower the cost of borrowing, generating a positive net tax advantage of corporate debt. The theory also implies that firms that reach $d^*$ faster than others will have less leverage. In other words, firms that are more likely to encounter financial distress at a given debt ratio are less likely to borrow. Supportive evidence shows that there is a significant negative relation between observed leverage measures and historical failure rates. The probability of financial distress is also positively related to the variability of operating earnings. In sum, the evidence is consistent with the generalized balancing theory.

Myers (1977)

In Myers (1977) the firm is viewed as a collection of assets in place and growth opportunities. Risky debt reduces the value of the real options, an agency cost. This cost arises either from a suboptimal underinvestment strategy or from the costs of avoiding underinvestment. This underinvestment results even when managers are acting in shareholders’ best interest. The level of borrowing is inversely related to the relative size of the growth opportunities and is determined by the tradeoff between these costs and the tax benefits of debt. The shareholders absorb the costs of avoiding underinvestment, which include:
• Rewrite/renegotiate debt contract
• Shorten maturity prior to “exercise date”
• Mediation
• Dividend restrictions
• Reputation effects
• Monitoring

The basic analysis considers the value of a firm facing an investment opportunity requiring an investment $I$ and paying $V(s)$.
A firm with risky debt $P$ will take the project only if $V(s) \geq I + P$. The analysis can be extended to a multiperiod setting with $V_t = V_{E,t} + V_{D,t}$.
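The single-period investment rule lends itself to a small numerical sketch of underinvestment. The state values, probabilities, and the names `invests` and `expected_npv` are all hypothetical; the only thing taken from the text is the rule itself: an all-equity firm invests when $V(s) \geq I$, while a firm with promised debt payment $P$ invests only when $V(s) \geq I + P$.

```python
def invests(value, cost, debt_promise=0.0):
    """Shareholders exercise the growth option only if V(s) >= I + P."""
    return value >= cost + debt_promise


def expected_npv(values, probs, cost, debt_promise=0.0):
    """Expected NPV of the option given the shareholders' investment rule."""
    return sum(p * (v - cost)
               for v, p in zip(values, probs)
               if invests(v, cost, debt_promise))


# Illustrative numbers only: four equally likely states.
values = [60.0, 90.0, 120.0, 150.0]
probs = [0.25, 0.25, 0.25, 0.25]
cost, debt_promise = 80.0, 30.0

npv_all_equity = expected_npv(values, probs, cost)
npv_with_debt = expected_npv(values, probs, cost, debt_promise)
underinvestment_loss = npv_all_equity - npv_with_debt

print(npv_all_equity, npv_with_debt, underinvestment_loss)
```

In this example the state with $V(s) = 90$ has positive NPV but is passed up once debt is outstanding, because the payoff net of the investment would go to bondholders; the lost NPV in that state is the agency cost.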
The firm will invest as long as the incremental benefit to equity satisfies
\[
\frac{dV_E}{dI} = \frac{dV}{dI} - \frac{dV_D}{dI} > 1.
\]
If the value of debt depends on the volatility of the firm value, $V_D = f(V, \sigma^2)$, then the transfer of value from equity to debt is
\[
\frac{dV}{dI} - \frac{dV_E}{dI} = \frac{dV}{dI} \cdot \frac{\partial f}{\partial V} + \frac{\partial f}{\partial \sigma^2} \cdot \frac{\partial \sigma^2}{\partial I} > 0.
\]
In conclusion, Myers’ work indicates that assets in place can support more debt than growth opportunities can, capital-intensive businesses with high operating leverage can support more debt, and more profitable firms should have more debt. This logic is similar to Shleifer and Vishny (1992), who say more liquid assets can support more debt.

Masulis (1980)

Masulis (1980) examines the valuation effects of capital structure changes on security value. The sample of intrafirm exchange offers and recapitalizations abstracts from asset changes that accompany many other changes in capital structure. The types of transactions considered include issuing debt for equity ($E \to D$), preferred for equity ($E \to P$), and debt for preferred ($P \to D$).

There are three primary sources of valuation effects. Tax-related stories predict changes in equity value to be positively related to increases in debt. Bankruptcy and reorganization expenses should cause a negative relation between equity value and leverage increases. Wealth redistribution from agency costs is a zero-sum game, so gains to one group of security holders come at the expense of another group. Two other theories that are not considered are signaling and the offer premium hypothesis.

The methodology employed uses comparison period returns. This approach essentially calculates the abnormal return for a security as the deviation from the mean return over a comparison period. These abnormal returns are averaged across all securities to get a portfolio abnormal return. The results are largely consistent with the tax and wealth redistribution effects, but provide little evidence about the bankruptcy costs.
Leverage-increasing transactions tend to increase equity value, while leverage-decreasing transactions tend to decrease shareholder value.
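The comparison period returns approach described above can be sketched as follows. The return series are made-up numbers for illustration, and the function names are hypothetical; the sketch only implements the two steps named in the text: subtract each security's mean comparison-period return from its event-period returns, then average the abnormal returns across securities.

```python
from statistics import mean


def abnormal_returns(event_returns, comparison_returns):
    """Event-period returns minus the security's mean comparison-period return."""
    benchmark = mean(comparison_returns)
    return [r - benchmark for r in event_returns]


def portfolio_abnormal_returns(securities):
    """Average abnormal returns across securities, event day by event day.

    securities: list of (event_returns, comparison_returns) pairs, where
    every security has the same number of event days.
    """
    per_security = [abnormal_returns(ev, cmp_) for ev, cmp_ in securities]
    return [mean(day) for day in zip(*per_security)]


# Illustrative data only: two securities, two event days each.
securities = [
    ([0.03, 0.01], [0.00, 0.02, 0.01]),
    ([0.05, -0.01], [0.01, 0.01, 0.01]),
]
portfolio_ar = portfolio_abnormal_returns(securities)
print(portfolio_ar)
```

Because the benchmark is each security's own mean return, the method needs no market model, at the price of ignoring market-wide movements during the event window.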
Table 5.1: Predictions in Masulis

Source        E→D    E→P    P→D
Tax           +      0      +
Bankruptcy    –      0      –
WR: E         +      +      –/0
WR: P         –      –      +
WR: D         –      –/0    –
Table 5.2: Predictions in Titman & Wessels

Attribute              Pred.
Collateral Value       +
Non-debt tax shield    –
Growth                 –
Uniqueness             –
Industrial             +
Size                   +
Volatility             –
Profitability          –

Significant Results: Yes; Small = ST; MV measure
Titman & Wessels (1988)

Titman and Wessels (1988) expand the range of capital structure theories tested and attempt to overcome some econometric problems. The paper uses a factor analytic technique, similar to some APT tests, to relate unobservable attributes to capital structure measures using observable data. The process involves estimating a measurement model and a structural model simultaneously. Although theoretically appealing, implementation requires imposing a number of restrictions on the loading matrix in the measurement model. The authors use six D/E ratios as dependent variables, obtained from all combinations of long-term, short-term, and convertible debt to book and market equity, $\{LT, ST, Conv\}/\{BE, ME\}$. The explanatory attributes are summarized in Table 5.2. The results indicate that uniqueness is important. The authors believe
this supports the costs of financial distress, but the proxies may also be related to non-debt tax shields and collateral value. The size effect for small firms is taken as evidence that transaction costs may be important. The analysis is unable to explain the cross-sectional variation in convertible debt. The lack of evidence in many cases may be due to problems with the measurement model.

Rajan & Zingales (1995)

The purpose of the Rajan and Zingales (1995) paper is to see if factors found to be important in determining capital structure in the U.S. are also important in other countries. This research is valuable because many of the theories that explain capital structure were developed in response to empirical observations. The paper studies the G-7 countries: U.S., U.K., Canada, Japan, Germany, France, and Italy.

There are a few limitations to the analysis. First, there is a bias towards large, listed companies. Second, there are variations in industry concentrations across countries. Third, there are differences in financial statements and reporting across countries. Finally, bank- versus market-oriented economies may produce systematic differences.

The primary analysis in the paper relates four factors to leverage, which is measured in both book and market terms. The factors are tangibility, M/B, size, and profitability. A long list of control variables is also included. The authors also look at the distribution of wealth transfers out of the firm and find that these payments are generally made through the most tax-advantaged route.

The general results indicate that U.K. and German firms tend to have lower leverage than firms in the U.S. The factors generally have the hypothesized relation with leverage. Tangibility is positively related, M/B negatively related. Size is positively related except in Germany, where it is negatively related.
Profitability is negatively related, except in Germany and France [this is opposite the predictions of Ross (1977a) and Myers (1977)].

Graham (1996)

Graham (1996) is the first paper to take a careful look at the role of marginal taxes in the capital structure decision. Economic theory indicates marginal rates are what matter, but previous studies have used statutory rates as a
matter of convenience. The approach for estimating marginal rates is to calculate, based on simulations, the present value of current and future taxes on a $1 increase in income. The main analysis regresses $(D_1 - D_0)/D_0$ on the marginal tax rate, relative cost of debt, probability of bankruptcy, non-debt tax shields, and a list of control variables. The results show that the marginal tax rate is important in explaining capital structure. The difference between statutory and marginal tax rates is also important, suggesting that firms still use the statutory rate in the capital structure decision. Firms with volatile tax rates tend to use more debt, as expected with a progressive tax schedule. The relative cost of debt has the wrong sign, but there may be a multicollinearity problem.
5.5 Dividends
Developing a model of dividend policy consistent with firms maximizing profits and individuals maximizing utility has been a challenge. MM moved the thinking away from the view that more dividends were better. Dividend irrelevance in perfect markets is based on the idea of replicating any desired payoff by buying/selling shares. Transactions costs remove the ability of individuals to make homemade dividends. There may be clienteles that prefer dividends. There are also behavioral arguments, market timing stories, and institutional constraints (“prudent man” rules).

Stylized Facts:
• Corporations pay out a significant portion of earnings as dividends.
• Dividends have been the predominant form of payout.
• Individuals in high tax brackets receive substantial dividends.
• Corporations smooth dividends.
• Market reactions are positively correlated with dividend changes.

Black (1976) presents arguments for and against dividends as the “dividend puzzle.” A firm may choose to pay dividends to provide a return expected by investors, even though this may be irrational. With transactions costs, dividends may be a better way to distribute wealth to shareholders than selling a few shares. The dividends may be used to signal information, such as higher expected future earnings. Finally, dividends could be used to expropriate wealth from bondholders. Reasons not to pay dividends include
tax avoidance, investment in growth opportunities, and the pecking order argument.
5.5.1 Factors Influencing Dividend Policy
Dividends and Taxes

Since capital gains taxes are typically lower than the tax on dividends, and capital gains can be deferred, there is a general tax disadvantage to dividends. This disadvantage may vary over investor types (low tax-rate individuals, corporations, tax-exempt institutions). The price drop on the ex-date has been well documented to be less than the dividend amount. The average premium increases with the dividend yield, consistent with the tax clientele hypothesis. There is also evidence of abnormal volume around the ex-date, indicating there is not a (perfect) tax clientele.

Signaling with Dividends

Signaling implications that have been tested empirically include (i) dividend changes should be followed by subsequent earnings changes in the same direction, (ii) unanticipated changes in dividends should be followed by revisions in the market’s expectation of future earnings, and (iii) unanticipated dividend changes should be accompanied by stock price changes in the same direction. There is only weak evidence that dividend changes convey information about future earnings. There is evidence that earnings forecast revisions are positively related to both dividend changes and the market reaction (causality), consistent with the signaling hypothesis. There is fairly strong evidence of a positive relation between market reaction and dividend changes.

Agency Costs and Dividends

Expropriation of bondholders may come in the form of dividend payments. Under this hypothesis, equity increases in value with a payout, while debt loses value. Under the alternative that dividends signal good news, both debt and equity should increase in value. There is evidence that bond prices drop significantly with dividend decreases, but do not change significantly with an increase. This is consistent with the information content explanation. Dividends also reduce the free cashflow problem of Jensen (1986). In sum,
there is weak empirical support for the informational content of dividends, and practically no support for dividends as a solution to agency problems.
5.5.2 Key Dividends Papers
Miller & Rock (1985)

Miller and Rock (1985) develop a model where firms use dividends to signal their quality in a setting where there is an information asymmetry about current earnings. The model has two periods. At time zero the firm invests in a project whose profitability is unobservable by investors. The project produces earnings at time one, which the firm uses to finance the dividend and new investment. The project produces additional earnings at time two, which are correlated with time one earnings. Good firms pay a level of dividends sufficiently high to make it unattractive for bad firms to copy them. Costs arise from the distortion in the investment decision. Dividends provide information about earnings through the sources and uses of funds identity. This model does not say why firms use dividends rather than repurchases.
• Outsiders cannot observe the cash flows.
• All firms have identical investments with diminishing marginal rates of return.
• External financing is done only with riskless debt.
• All dividends and capital gains are taxed at $\tau$.
• Dividends and repurchases are perfect substitutes.
• Firms signal by distributing cash and altering their investments.
• Good firms are able to distribute more cash and still match investments of bad firms.
• Bad firms cannot afford to mimic the good firms because they would have to forgo projects with relatively high marginal returns.
• The equilibrium has deadweight costs relative to the perfect information case.

In this two-period model a firm has a concave investment technology $F(I)$ and makes investments $I_t$ at $t \in \{0, 1\}$ that generate random earnings $\tilde{X}_{t+1} = F(I_t) + \tilde{\varepsilon}_{t+1}$. The errors are unconditionally mean zero, but $E[\tilde{\varepsilon}_2 \mid \varepsilon_1] = \gamma \varepsilon_1$. The sources and uses of funds identity requires
\[
I_1 + D_1 = \tilde{X}_1 + B_1,
\]
where $B_1$ is additional financing and $D_1$ the dividend.
At time 1 the value of the shares is
\[
V_1 = D_1 - B_1 + \frac{F(I_1) + \gamma \varepsilon_1}{1 + r}. \tag{5.1}
\]
The firm maximizes value by choosing $I_1$, $D_1$, and $B_1$ subject to the sources/uses constraint. Substituting for the net dividend, the FOC is $F'(I_1^*) = 1 + r$. The earnings announcement effect is
\[
V_1 - E_0[V_1] = \varepsilon_1 \left[ 1 + \frac{\gamma}{1 + r} \right] = \left( X_1 - E_0[\tilde{X}_1] \right) \left[ 1 + \frac{\gamma}{1 + r} \right]. \tag{5.2}
\]
The difference between actual and expected dividends is
\[
(D_1 - B_1) - E_0[D_1 - B_1] = X_1 - E_0[\tilde{X}_1] = \varepsilon_1
\]
and the dividend announcement effect is the same as (5.2). The dividend announcement reveals information about current earnings, which in turn are useful for predicting future earnings. There are two components to the announcement effect. The first is a dollar-for-dollar reaction to the dividend surprise. The second is the discounted future change arising from the persistence parameter. In this model, earnings announcements shortly after the net dividend announcement should not contain any new information. In practice such earnings announcements do appear to be informative. This is because they contain information on outside financing, which is not part of the gross dividend. Financing announcement effects are similar to dividend announcements, but with the sign reversed.

With intermediate trading, optimal policies are inconsistent because a firm could pay a higher dividend by forgoing investments and raise the stock price. A solution to this problem is underinvestment. The informational asymmetry is that at time 1 the market knows the initial investment and the first dividend, while the directors also know the cashflow and investment. That is, $\Omega^m = \{I_0, D_1\}$ and $\Omega^d = \{I_0, D_1, I_1, X_1, B_1, \varepsilon_1\}$. As a result, the directors and market have different valuations of the firm. The directors value the firm according to (5.1). The market can only use its information in the valuation
\[
V_1 = D_1 - B_1 + \frac{E_1^m[F(I_1) + \gamma \tilde{\varepsilon}_1 \mid \Omega^m]}{1 + r}.
\]
The managers choose the net dividend and investment to maximize the weighted average of the two valuations subject to the sources/uses constraint.
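Setting the signaling problem aside for a moment, the full-information announcement effect in (5.2) can be checked numerically. The sketch below assumes a specific concave technology, $F(I) = a\sqrt{I}$, and illustrative parameter values; these choices and the function names are hypothetical and not part of Miller and Rock's model. It values the shares with (5.1) at the optimal investment, once at the realized earnings and once at expected earnings, and compares the difference with $\varepsilon_1[1 + \gamma/(1+r)]$.

```python
import math


def production(investment, a=22.0):
    """Assumed concave technology F(I) = a * sqrt(I) (illustrative only)."""
    return a * math.sqrt(investment)


def optimal_investment(r, a=22.0):
    """Solve the FOC F'(I) = a / (2 * sqrt(I)) = 1 + r for I."""
    return (a / (2.0 * (1.0 + r))) ** 2


def share_value(x1, eps1, r, gamma, a=22.0):
    """Equation (5.1) with the net dividend D1 - B1 = X1 - I1 substituted in."""
    i1 = optimal_investment(r, a)
    net_dividend = x1 - i1
    return net_dividend + (production(i1, a) + gamma * eps1) / (1.0 + r)


r, gamma = 0.10, 0.55
i0, eps1 = 100.0, 2.0

expected_x1 = production(i0)   # E_0[X_1] = F(I_0)
realized_x1 = expected_x1 + eps1

surprise = (share_value(realized_x1, eps1, r, gamma)
            - share_value(expected_x1, 0.0, r, gamma))
formula = eps1 * (1.0 + gamma / (1.0 + r))

print(f"value surprise = {surprise:.6f}, formula (5.2) = {formula:.6f}")
```

The two numbers agree because investment is fixed at its optimum by the FOC, so the earnings shock passes one-for-one into the net dividend and its persistence adds the discounted term $\gamma \varepsilon_1/(1+r)$.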
The weights are the fraction owned by selling stockholders and the fraction retained. The public can use its information about the net dividend to infer the earnings for which the dividend is optimal. Although there are an infinite number of informationally consistent valuation schedules, one Pareto dominates the others. A firm with the lowest earnings will choose the same net dividend and investment level as in the full information case, giving a boundary condition. The solution to an ODE satisfying the maximization problem has all net dividends at least as large as the optimal level. Higher dividends serve as a signal of higher current earnings. The better firms are able to pay out a higher dividend and forgo productive investments. Since the investment technology is concave, forgoing projects has a higher marginal cost for the lower quality firms. This separating equilibrium restores consistency, but at the expense of underinvesting.

There is some empirical evidence supporting the validity of dividends as signals. Examples include Vermaelen (1981) and Prabhala (1993), but ? do not find supportive evidence. Since the Miller and Rock (1985) model is a response to the observation that unexpected dividend changes are positively related to stock price changes, one would expect to find some supportive evidence.

Prabhala (1993)

Prabhala (1993) presents a framework where dividends serve as a signal of the quality of investment opportunities. This comes in response to earlier literature where Tobin’s q and dividend yield are claimed to explain stock price reactions to dividend announcements arising from agency costs of free cashflows and the existence of dividend clienteles. This same evidence is consistent with a signaling model which subsumes the importance of the other effects. The motivation for the signaling interpretation is that the other interpretations are inconsistent with rational expectations.
Since q, dividend yield, firm value, and stock price are useful in predicting dividends, they should be used in making optimal forecasts. The alternative interpretations depend on dividend changes being unanticipated. This model can be viewed as an extension of Miller and Rock (1985), where the information asymmetry now relates to the quality of growth opportunities θ. A larger net dividend gets a higher market price at t = 1, but reduces investment and the cashflow at t = 2 which is distributed to the
remaining (1 − k) shareholders. Since signal costs decrease with firm type, the better firms can afford to signal more than the lower quality firms.

The dividend yield effect has been interpreted as evidence supporting the existence of clienteles. Evidence that high-yield firms experience larger announcement effects is consistent with this argument. Prabhala reinterprets this evidence in a signaling framework where dividends are more informative about growth opportunities for high-yield firms. Also, these firms are less likely to have strong growth prospects so dividend increases are less likely.

Prior studies show dividend increase announcement effects for low q firms are larger than for the high q firms, consistent with the free cashflow hypothesis. This is because a reduction in FCF is more valuable for firms which tend to squander cash. The signaling interpretation reflects the market’s expectations: high q firms are more likely to have better growth prospects and are more likely to increase dividends so dividend increases result in smaller announcement effects.

The empirical methodology estimates a dividend forecast, then examines whether the deviation from the forecast explains price changes. The explanatory variables used to forecast the dividend are long-term dividend yield, q, firm value, stock price, the difference in long- and short-term yields, and stock volatility. Announcement effects are then regressed on dividend surprises. Results indicate a positive relation between dividend surprise and announcement effect, and dividends are more informative signals for high-yield firms. Tobin’s q has limited marginal benefit beyond the dividend surprise. There is little evidence of the agency or clientele effects after controlling for the signaling effect, although these former hypotheses are not explicitly rejected.

Vermaelen (1981)
Vermaelen (1981) examines the price behavior of securities when firms repurchase shares in a tender offer or on the open market.
This allows testing the importance of information/signaling, personal taxes, corporate taxes, and bondholder expropriation. Repurchases serve as a signal of firm value since managers’ ownership, etc. creates an incentive to increase stock price by announcing a tender offer. Repurchasing shares above their true value will dilute the value of the managers’ holdings. But with positive information, the manager may be willing to pursue a tender offer. The more valuable the information, the lower the marginal cost to buying back large fractions, offering a higher premium,
and holding more shares in the firm. The price during the offer is given by $P_A = \alpha P_T + (1 - \alpha)\bar{P}_E$, with α being the ratio of the fraction purchased to the fraction tendered.

Vermaelen finds that repurchase announcements are followed by a permanent increase in stock price. Signaling seems to be the predominant influence. There is no evidence of wealth expropriation from bondholders or tendering shareholders. Those that do not tender are worse off than those that do, but they are better off than before. The results are also inconclusive with respect to the leverage and personal tax hypotheses.

Open market transactions are associated with a negative CAR prior to the announcement, followed by an abnormal return of roughly 2% around the announcement. Tender offers exhibit a flat CAR prior to announcement, but an abnormal return on the order of 15% around the announcement. Following the announcement, the tender offers have a decline in the CAR, which is consistent with the expiration of some of the offers. Looking specifically at the expiration of offers, there is a negative abnormal return.

The abnormal return to shareholders, INFO, is regressed on a number of signaling variables to test this hypothesis. INFO is defined as $I/(N_0 P_0) = (1 - F_P)(P'_E - P_0)/P_0 + F_P (P_T - P_0)/P_0$, the weighted average of the return to non-tendered and tendered shares. The results are consistent with the signaling hypothesis. The size of the offer premium, target fraction, managerial ownership, and subscription level are all positively related to the value of information.

DeAngelo, DeAngelo & Skinner (1996)
DeAngelo, DeAngelo, and Skinner (1996) provide another test of dividends as signals. They identify stocks with a history of growth followed by a decline in earnings and examine the dividend policy before and after the decline. They find that dividends are not reliable signals of future earnings.
The results could be due to overoptimistic managers who “oversignal,” to the relatively small cash commitment of a dividend undermining its credibility as a signal, or to signaling based on imperfect information. At the year 0 dividend decision, 68% of the firms increase dividends while only 1% decrease dividends. There is no evidence of positive earnings surprises among these firms over the next three years, and some evidence of
negative surprises. Dividend increases cause small abnormal returns at the announcement, but over the course of the year the firms have large negative abnormal returns. The dividend increasing firms have a less negative abnormal return than the decreasers, which suggests that managers may be able to prop up the stock price with a dividend increase.
Eades, Hess & Kim (1994)
Eades, Hess, and Kim (1994) examine the time series of ex-dividend day pricing and identify variation due to tax effects, strategic short-term trading (dividend capturing), and business cycle effects. They find the variability in pricing is positively correlated with dividend yield and dividend pricing is countercyclical. Dividend capturing reduces ex-date returns and depends on transactions costs, interest rates, and dividend yield.

The methodology forms ex-date portfolios on each calendar date. Standardized excess portfolio returns (SER) are the ex-date portfolio return (including the dividend) less the average non-ex-date portfolio return, divided by the estimated portfolio standard deviation. The portfolios are further subdivided into high-yield and low-yield portfolios. The SER of the low-yield portfolio is always positive, has relatively low variation, and zero to negative autocorrelation. The SER for the high-yield portfolio changes from positive to negative, is more volatile, and exhibits high positive autocorrelation.

The tax effect hypothesis is tested by including dummy variables for different tax regimes in an ARIMA model. There is little evidence of a tax effect. The test of the dividend capturing hypothesis includes a dummy for the introduction of negotiated commissions. This lowers transactions costs and makes it easier for corporations to perform tax arbitrage. The dummy is significantly negative, especially for the high-yield firms. This is consistent with the dividend capturing hypothesis. The dividend capturing hypothesis also predicts dividend capturing is negatively related to T-bill yields and positively related to dividend yields. The evidence also supports these predictions. Analysis of the business cycle effects indicates that low-yield firms are valued countercyclically (procyclical ex-date returns). The high-yield firms do not exhibit this pattern because the dividend capturing effects work in an offsetting direction.
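The SER definition in the Eades, Hess, and Kim (1994) summary can be made concrete with a small sketch. This is only an illustration under stated assumptions: the function name and the sample returns are hypothetical, and the standard deviation is taken to be the sample standard deviation of the non-ex-date returns, since the notes do not say how the portfolio standard deviation is estimated.

```python
from statistics import mean, stdev


def standardized_excess_return(ex_date_return, non_ex_date_returns):
    """Return (ex-date return - mean non-ex-date return) / std. dev.

    ex_date_return: ex-date portfolio return, dividend included.
    non_ex_date_returns: returns of the same portfolio on non-ex-dates.
    Assumption: the standard deviation is the sample standard deviation
    of the non-ex-date returns; the notes do not specify the estimator.
    """
    if len(non_ex_date_returns) < 2:
        raise ValueError("need at least two non-ex-date returns")
    benchmark = mean(non_ex_date_returns)
    sigma = stdev(non_ex_date_returns)
    if sigma == 0:
        raise ValueError("non-ex-date returns have zero variance")
    return (ex_date_return - benchmark) / sigma


# Hypothetical daily returns, for illustration only.
ser = standardized_excess_return(0.03, [0.00, 0.01, 0.02])
print(round(ser, 4))
```

A positive value means the ex-date return exceeds the typical non-ex-date return, measured in units of the portfolio's return volatility.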
5.6 Corporate Control
Manne (1965)
The Manne (1965) paper is the first to introduce the idea of a market for corporate control. For the market for corporate control to be effective there must be a high positive correlation between managerial efficiency and share price. Takeovers lead to competitive efficiency among managers and are more efficient than bankruptcy. They allow increased mobility of capital which provides more efficient allocation of resources.

Corporate control may be transferred through a proxy contest, direct share purchases, or mergers. Proxy contests are the most expensive, most uncertain, and least used. This method tends to be used when the issue is compensation, not managers’ policies. Proxy contests are more likely with dispersed share ownership. The share price generally rises on the announcement.

Direct share purchases may be open market purchases, direct purchases of blocks from large owners, or tender offers. With lower ownership concentration other shareholders are more likely to participate in the premium and outsiders are willing to pay less for control.

Mergers typically offer cost advantages over the other methods. In a merger the managers’ interests are generally in line with the owners’. The main exception is that managers do not have an incentive to buy managerial services as cheaply as possible. When incumbent managers recommend a merger there are likely to be side payments. Within an industry mergers may be an alternative to bankruptcy. These mergers typically reduce the information gap between the target and bidder.
Shleifer & Vishny (1986)
Shleifer and Vishny (1986) examine the role of large shareholders as monitors and the ways in which they bring about improvements in corporate policy. The basic idea is that someone needs to monitor the managers, but it is too expensive for small owners to do so. Large shareholders are better able to bear the monitoring costs and will do so when it is in their best interest.

In the model the large shareholder L has a probability I of getting a value improvement Z above q from a probability distribution F(Z) at a cost C(I). The large shareholder begins with α shares so he needs an additional .5 − α
to attain control. If he invests C(I) he will bid q + π where

$0.5Z - (0.5 - \alpha)\pi - c_T \geq 0$    (5.3)

and $c_T$ represents the costs of making the bid. The small shareholders will tender if $\pi - E[Z \mid Z \geq (1 - 2\alpha)\pi + 2c_T] \geq 0$. Let $\pi^*(\alpha)$ and $I^*(\alpha)$ be the optimal amounts, and $Z^c(\alpha)$ be the cutoff value at which L is indifferent about taking over.

There are a number of important results. First, the premium decreases in L’s stake, $\pi^{*\prime}(\alpha) \leq 0$. Second, a larger initial stake permits takeovers for smaller improvements, $Z^{c\prime}(\alpha) < 0$. Third, with a larger stake L invests more in monitoring, $I^{*\prime}(\alpha) > 0$. Next, the expected increase in firm profits rises with α, given L has an improvement. Therefore, an increase in α decreases the premium but increases the market value of the firm. Increasing $c_T$ will increase the takeover premium but decrease the market value of the firm. There is not an equilibrium where L attains more than the amount necessary for control, say 50%. This is because the small shareholders will infer that L is trying to profit at their expense.

“Jawboning” is an alternative to a takeover. Essentially, L uses his size as a threat of takeover. The managers may then be willing to negotiate and make some of the changes L seeks. This method can be incorporated into the above analysis by including the condition that (5.3) be greater than αβZ, where β is the proportion of the potential value gain attainable through negotiation. Jawboning will typically be used for making less valuable improvements since the costs are typically lower. As before, the value of the firm increases with α, but now the option to jawbone can actually make the larger shareholder worse off. This is because the required bid on the takeovers rises. Small shareholders can be worse off as well since takeovers are typically more valuable to them than private negotiation.

Assembling a large block is a complicated problem. If L can accumulate a position anonymously he can deprive small shareholders of their gains from his larger holding.
If L trades publicly, small shareholders will bid the price up to reflect the potential value gain. This makes it expensive for L to get his position. He will want to increase his position again to offset these additional costs. But the small shareholders will see this and hold out from selling the first time. Similarly, L will never fragment his stake because
doing so reduces the value of his remaining shares since there will be less monitoring. Assembling a block is a one-way proposition. It is expensive to do, so once done it should not be undone. Large blocks should be sold intact to preserve the value of monitoring.

Dividends may provide the compensation to L necessary to get him to assemble a block. Large shareholders are typically corporations who enjoy tax benefits on dividend income. Dividends are a sort of bribe from the small shareholders to the large to get them to serve as monitors.

Stulz (1988)
Stulz (1988) shows that management’s voting power is important in determining capital structure. For small α, ∂V/∂α > 0; for large α, ∂V/∂α < 0. The intuition is that the premium offered in a takeover increases with α, but the probability of an offer falls. When α is too high, it is beneficial to make a takeover less costly to management with a golden parachute, for example. There is no benefit in this model to the manager holding the controlling interest. He will be able to block any takeover in this case. This implies α∗ ∈ [0, 1/2). The conflict of interest in the model arises from the fact that successful tender offers affect the wealth of outside shareholders and managers differently.

These results are demonstrated in a single period model where the manager owns α of an all equity firm. At the beginning of the period there is homogeneous information and a bidder decides if he wants to get information on the target. He pays I for information delivered at the end of the period. The bidder will bid for half the shares at a price of the no-bid value plus a premium on all the shares, y/2 + P. All the benefits of the value increase go to the target. The probability of a successful offer depends on the likelihood shareholders’ tax rates are low enough to accept the bid and the fraction of outsiders needed to make the offer a success.
The bidder chooses P to maximize the difference between the gain and the premium times the probability of making a successful bid. With α > 0, the bidder has to persuade $z(\alpha) = \frac{1}{2(1-\alpha)} > 1/2$ of the outsiders to tender. Increasing α decreases the probability of a successful bid so the bidder’s expected value falls as well. For the bidder the optimal premium increases with α.

Allowing the manager to tender preserves the general results. More risk-averse managers will hold fewer shares since they are risky. With DARA preferences, α will increase with the manager’s wealth. Managers with greater benefits from control will hold more shares to protect their interests. Managers will also hold more shares when the sensitivity of offer success to changes in ownership is large.

[Figure 5.2: Managerial Ownership and Firm Value. Axes: firm value V against managerial ownership α; labels MSV, JM, and Stulz.]

Due to risk aversion and budget constraints, managers typically hold only a small portion of the shares. Alternatives that increase (some of) their voting power can increase firm value. Changing the debt ratio or repurchasing shares will increase α. Convertible debt and delayed conversion can also help since conversion will decrease α. By changing the requirements for control a super-majority rule or differential voting rights effectively increase α. The manager may also have voting power over shares he does not own. In ESOPs and pensions the manager is often the trustee. A standstill agreement gives the manager voting power over a large shareholder’s position but may also effectively eliminate a bidder.

Morck, Shleifer & Vishny (1988)
Morck, Shleifer, and Vishny (1988) offer an empirical test of the effect of managerial ownership on firm value. High managerial ownership may be good because it aligns the incentives of the managers with the shareholders. However, too much managerial ownership may be bad because the manager may become entrenched. The analysis studies the relation between Tobin’s q and managerial ownership after controlling for intangible assets, tax shields, size, and industry. More specific tests distinguish between insider and outsider
ownership and connections to founding families.

In the study management is defined as the board of directors. Tobin’s q is regressed on measures of growth opportunities (R&D/A, Adv./A), tax shields/capital structure (D/A), size (A), industry dummies, and dummies for board ownership. The ownership dummies indicate ownership up to 5%, from 5% to 25%, and above 25%.

The results indicate that there is a positive relation between ownership and firm value at very low and very high levels of ownership, and a negative relation in between. The explanation is that the incentive alignment effect is always present, but the entrenchment effect is not important until ownership is sufficiently high. Also, managers become fully entrenched at some point, while the incentive effect continues to increase. The results are robust to different ownership breakpoints and measures of firm value. Analysis of the board composition indicates that outsiders are slightly better monitors but still become entrenched. Close connection to the founding family increases value in new firms but decreases value in old firms.

The main results are depicted graphically in Figure 5.2 (not drawn to scale). These results are consistent with a combination of the predictions in Jensen and Meckling (1976) and Stulz (1988). These results may be partly due to the fact that managers in high q firms are more likely to have more stock. This is likely to be especially important in the low ownership range and can induce a spurious correlation between ownership and q.

Cotter, Shivdasani & Zenner (1996)
Cotter, Shivdasani, and Zenner (1996) examine the effect that outside directors have on the value of target shareholders. This is a situation where the board should be particularly important. Insiders on the board may have incentives that are different from those of outsiders. The results indicate that target shareholder gains are 20% larger when the board is independent. The value comes at the expense of the bidder shareholders.
Outsiders are associated with higher initial bids and greater offer revisions. With an independent board, defense mechanisms such as poison pills enhance shareholder returns rather than entrench managers. Target gains are negatively related to interlocking boards and positively related to ownership of insiders. The study examines the impact of board composition on the initial premium, premium revisions, and target shareholder gains. Board members are classified as independent, insiders, or gray. Control variables include size,
poison pills, golden parachutes, managerial ownership, block ownership, and performance.
5.7 Mergers and Acquisitions
There are many potential benefits to mergers and acquisitions. These takeovers can remove inefficient management, achieve economies of scale, or generate synergies. Offsetting these benefits are costs such as wealth redistributions and reduced efficiency, which may arise due to the numerous conflicts of interest and informational asymmetries.

Stylized Facts
• target SH earn large positive AR and negative AR on failure
• bidding SH earn zero to negative AR
• multiple bidder contests magnify AR
• bidder AR were lower in 1980’s than before
• joint MV increases on average
• success is highly uncertain and positively related to bid premium and toehold
• defensive measures reduce probability of success
• target reaction to defensive measures and greenmail is negative
• large target mgt. share increases bid premium
• prob. of hostile takeover lower with high target D/E
• bid revisions are large jumps
• puzzlingly low toeholds
• mixed evidence about means of payment
5.7.1 Tender Offers
Tender offers can be either conditional or unconditional on attaining a critical level of participation. A target shareholder is more likely to tender if he thinks the post-takeover value is low and if he thinks he is pivotal. A shareholder has an incentive not to tender if he thinks the post-takeover value is high. He can let others tender so the takeover succeeds, giving him much of the benefit.

Complete Information
With complete information about the future value, no shareholders will tender for less than the future value; all of the potential benefits go to the target
shareholders and none to the bidder. The bidder may be able to make a profit by diluting the value of minority shares after the takeover. The threat of this dilution may induce the target SH to tender at a price less than the full future value. If the bidder has a toehold he will also be able to profit even without dilution. In practice the gains on the toehold are not likely to be important since toeholds are typically small. A bidder may be able to threaten the target SH in other ways to get them to tender as well. One example is to threaten to enter the target’s market and compete with it, thereby reducing the value of the target.

Incomplete Information
A bidder may have a better idea about the future value than the target. Under rational expectations, targets know that bidders will try to use their superior information to under-bid. In equilibrium, the free-rider problem remains and target shareholders will still refuse to tender. There are two types of equilibrium, one where offers are uninformative, the other where the offer provides information.

This problem was originally studied in Grossman & Hart (). The two-tiered tender offer and dilution of holdout shares are potential solutions to the problem. The difference is the type of signaling possible in a two-tiered bid, which allows separation of the signals for undervaluation and private synergies. This type of offer can eliminate the incentive to free-ride without voluntary dilution. Another approach is to solve the free-rider problem by allowing individuals to realize the effect their action has on the outcome. This is a rational, but not a competitive, outcome.

In Shleifer and Vishny (1986) there are incentives in the form of dividends for large shareholders to monitor the managers. This increases the value of the shares for all shareholders, including the small shareholders. The intention of acquiring a large block also raises the share price, making it more costly to acquire the block.
The dividend incentive argument, which presumes the large shareholders are corporations, is not well-supported empirically (cite ??). A low bid may signal that the expected improvement is small. Since bidders with high potential improvements have a stronger incentive to bid high, a low bid is a credible signal. The probability of an offer’s success increases with the bid premium and size of the toehold and decreases with the number of additional shares needed for control.
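The complete-information free-rider logic above can be illustrated with a short sketch. It is a simplification under stated assumptions, not a model taken from the notes: the firm is normalized to one share, the toehold is assumed to have been bought at the pre-bid value, dilution is a per-share amount the bidder can extract from minority shares, and all names and numbers are hypothetical.

```python
def min_accepted_bid(post_value, dilution=0.0):
    """Lowest bid a non-pivotal shareholder accepts.

    Holding out yields a minority share worth post_value - dilution,
    so no one tenders for less than that amount.
    """
    return post_value - dilution


def bidder_profit(pre_value, post_value, toehold=0.0, dilution=0.0, bid_cost=0.0):
    """Bidder's profit when he bids the lowest acceptable price.

    Assumptions (hypothetical, for illustration): one share outstanding,
    the toehold fraction was bought at pre_value, and every non-toehold
    share, tendered or not, leaves the bidder post_value - bid.
    """
    bid = min_accepted_bid(post_value, dilution)
    toehold_gain = toehold * (post_value - pre_value)
    outside_gain = (1.0 - toehold) * (post_value - bid)
    return toehold_gain + outside_gain - bid_cost


# No toehold and no dilution: all gains go to target shareholders.
print(bidder_profit(100.0, 120.0, bid_cost=1.0))
# A 5% toehold or a dilution threat lets the bidder keep part of the gain.
print(bidder_profit(100.0, 120.0, toehold=0.05))
print(bidder_profit(100.0, 120.0, dilution=5.0))
```

Under these assumptions the bidder's profit reduces to the toehold gain plus the dilution on outside shares, less bidding costs, which is why a bidder with neither a toehold nor the ability to dilute cannot recover his costs.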
Defensive Actions
Defensive actions may reduce shareholder value if they block a potentially good takeover, but they may also improve value by increasing the incentive to bid high and encouraging other bids. Some actions, such as the poison pill, reduce the incentive to bid high. In general, strategies that impose greater costs on the bidder when the offer succeeds than when it fails reduce the incentive to bid high. In summary, some defensive measures are in the shareholders’ best interest, while others are used to create private benefits for the manager.

More subtly, a defensive measure may change the informational asymmetry. Decreasing the importance of publicly known improvements decreases the probability of success. Another defensive measure is to signal to the target shareholders that their shares are undervalued, in which case they are less willing to tender. Target management may do this by increasing leverage and/or repurchasing shares. There may also be reputational effects to consider.

Pivotal Shareholders
A pivotal shareholder is more likely to tender than a non-pivotal one. A large blockholder is much more likely to be pivotal than a small investor. The ability of a bidder to revise a bid becomes very important with pivotal shareholders.

Means of Payment
The means of payment has important consequences for the information revealed by the bidder. Offering equity may indicate that the bidder’s shares are overvalued [Myers and Majluf (1984)]. An offer of cash may signal high value. Cash offers create an adverse selection problem for the bidder. Offering equity can reduce the risk of overpayment by making the terms of the offer contingent on the target’s value. The target shares in gains or losses so it will tend to reject the transactions that are likely to be undesirable. Finally, there are tax advantages to using at least 50% equity financing. With no private information on the part of the target, the target can increase the auction price with equity.
If the target has private information about the synergy, the bidder could benefit by conditioning the target’s acceptance on its value.
5.7.2 Competition Among Bidders
In the English auction model of bidding, bidders trade incremental bids until the bidders with the lowest valuations drop out. The winning bid will be just above the second highest valuation. A very important assumption is that bids can be costlessly revised and resubmitted. If bids are costless to submit but there is an investigation cost, the bidder’s strategy will change. Now he will want to submit a large initial bid to avoid a costly bidding contest. The high initial pre-emptive bid does not deter other bidders directly by requiring them to improve on it; rather, it signals that the initial bidder has a high valuation, which reduces the probability of additional bidders. All the bidders that decide to investigate will then enter into the English auction. If bids are costly to submit, then the revised bids will move in large steps.

Management may undertake activities to discriminate among bidders. In general, exclusion of bidders is viewed as bad for target shareholders. There are some reasons that these defensive measures may be good. For example, target management may reject a bid if the target firm is worth more, or if it is likely that other bids will come. Other measures may be repurchasing shares to make a takeover more difficult, or removing the incentive for the takeover by fixing existing problems. Removing bidders may increase ex ante the frequency of bidding competition. It is optimal to pay greenmail only if there is no white knight.
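The English auction result, that the winning bid ends up just above the second highest valuation when bids can be revised costlessly, can be checked with a minimal sketch. The function name, the fixed bid increment, the tie-breaking rule, and the valuations are all assumptions made for illustration.

```python
def english_auction(valuations, increment=1.0):
    """Simulate an ascending auction with costless bid revisions.

    The price rises by `increment` each round and a bidder stays in
    only while his valuation covers the next price. Returns the index
    of the winner and the price he pays.
    Assumption: ties are broken in favor of the lowest index.
    """
    if not valuations:
        raise ValueError("need at least one bidder")
    price = 0.0
    active = list(range(len(valuations)))
    while len(active) > 1:
        next_price = price + increment
        stay = [i for i in active if valuations[i] >= next_price]
        if not stay:
            # Everyone left drops at the same step: tie at current price.
            return active[0], price
        price = next_price
        active = stay
    return active[0], price


# Hypothetical bidder valuations.
winner, price = english_auction([50.0, 80.0, 65.0])
print(winner, price)
```

With a small increment the price paid approaches the second highest valuation from above. The pre-emptive bidding and costly-bid cases discussed in the text are not modeled here.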
5.7.3 Managerial Power
Managers have potentially conflicting interests of maximizing shareholder value and looking out for their own best interest. Increasing target debt levels may be a way of reducing some of the agency costs that may be the impetus for the takeover. High leverage may also allow the target to capture a greater fraction of the bidder’s improvements. Shifts in debt levels can also affect management’s voting power and gains from change in control. If the supply of shares is upward sloping, bidders must offer larger premiums. This causes the level of the bid to increase with the manager’s share ownership.
5.7.4 Key Papers
Roll (1986)
The hubris hypothesis of Roll (1986) suggests that takeovers occur because managers overestimate their own abilities. Under this hypothesis the gains to takeovers are small or non-existent. This explanation is not inconsistent with strong-form efficiency, whereas other explanations require at least temporary inefficiency. There is some evidence that is generally consistent with this idea.

Since the current market price is a lower bound on bids, only bids with relatively high valuations are observed. Thus takeover attempts are likely to contain random overvaluation errors. Since these transactions are driven by a relatively small number of people and depend heavily on individual decisions, irrational behavior is more likely.

The theory predicts that the bidding firm will have a price decline on the announcement followed by a further decline on winning or an increase on losing the bid. The total gains to the takeover should be non-positive. Gains to the target come at the expense of the bidder and transactions costs are a deadweight loss. There should be more hubris among firms that have been successful recently.

Lang, Stulz, & Walkling (1989)
Lang, Stulz, and Walkling (1989) examine the variation in tender offer abnormal returns to understand the determinants of the bid premium. The results indicate benefits are greatest for high q bidders and low q targets, consistent with the Jensen (1986) FCF argument. The evidence that high q bidders profit indicates that the hubris hypothesis is not a complete description of the process.

In successful tender offers the typical bidder has a low q for several years prior to the bid. The target’s q tends to have declined recently. The takeovers creating the most value are high q firms acquiring low q firms. The most value is lost when low q firms acquire high q firms.
These results could also be interpreted to mean that value is created when undervalued firms are acquired and destroyed when overvalued firms are acquired. The analysis regresses gains on dummy variables for the q of the bidder and target. The q is measured as either the average over three years prior to the bid or from the most recent year. The regressions do not control for
the growth opportunities, form of payment, or number of bidders. This is a problem because q can measure not only management ability, but also the growth potential of the firm. Separate regressions are performed for bidder gains, target gains, and total gains. Each is further subdivided by whether there are opposing offers.

Berger & Ofek (1995)
Berger and Ofek (1995) examine the effect of diversification on firm value. Diversification programs were popular in the 1950s and ’60s, but more recently firms have moved in the opposite direction. The authors compare the sum of imputed stand-alone values for a firm’s segments to the market value of the firm. The general conclusion is that diversification tends to reduce firm value by roughly 15%. Unrelated diversifications destroy the most value.

There are many potential benefits and costs to consider in analyzing the value of diversification. There are gains in operating efficiency, increased debt capacity, reduced taxes, and efficiencies with internal capital markets. Potential agency costs such as FCF, cross-subsidization, and incentive conflicts between and among the divisions weigh against the benefits.

Imputed valuations are based on the median ratio of capital to {EBIT, A, S} among single-segment firms. For each multi-segment firm the value from diversification is the log of the ratio of actual value to imputed values.² The results indicate that multi-segment firms tend to have lower ratios. Regressions of excess value on multi-segment indicators indicate that diversification reduces value by 15%, even after controlling for size, profitability, and growth. Acquisitions of related segments tend to be less harmful than diversifying acquisitions. The results are robust to imputed measure and persist over time.

In examining the sources of value gain or loss the authors consider overinvestment, cross-subsidization, and tax effects.
Overinvestment, as measured by capital expenditures to assets in low-q segments, is negatively related to excess value, especially for diversified firms. The cross-subsidization effect is captured by a dummy variable for negative cashflows in a segment. Again, the effect is more negative for multi-segment firms. The evidence suggests that the tax benefits are economically insignificant.

² This is biased towards a diversification discount since logs are not symmetric about 1. Also, the allocation of overhead and reliance on segment-level reporting may create biases in the imputed valuations.
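The excess value measure described in the Berger and Ofek (1995) summary can be sketched as follows. This is an illustration under assumptions: the function names, the industry labels, and the multiples are hypothetical, and a single accounting item (sales, say) is used where the paper considers EBIT, assets, and sales in turn.

```python
import math


def imputed_value(segments, industry_multiples):
    """Sum of imputed stand-alone segment values.

    segments: list of (industry, accounting_item) pairs, e.g. sales.
    industry_multiples: median capital-to-item ratio of single-segment
    firms in each industry (hypothetical inputs here).
    """
    return sum(item * industry_multiples[industry] for industry, item in segments)


def excess_value(actual_value, segments, industry_multiples):
    """Log of the ratio of actual firm value to imputed value."""
    return math.log(actual_value / imputed_value(segments, industry_multiples))


# Hypothetical two-segment firm, for illustration only.
segments = [("retail", 100.0), ("oil", 50.0)]
multiples = {"retail": 1.0, "oil": 2.0}
print(round(excess_value(170.0, segments, multiples), 4))
```

A negative excess value indicates that the firm trades below the sum of its imputed stand-alone segment values, which is the diversification discount discussed above.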
Mitchell & Lehn (1990)
Mitchell and Lehn (1990) ask “Do Bad Bidders Become Good Targets?” The answer seems to be yes. The idea is to see whether takeovers discipline managers of firms that have demonstrated poor acquisition programs. This suggests that at least part of the gains to targets may be in reduced agency costs.

The authors find that there is little change in value for acquisitions in general. But there is a significant decrease in value for bidders who are subsequently acquired. For all firms, the average gain for an acquisition that is later divested is smaller. This effect is especially true for firms that later become targets themselves. Finally, the probability that a firm becomes a target is inversely related to the announcement effects of its acquisition.

In defining bad bidders it is important to distinguish between overpayment, which cannot be fixed, and poor ongoing performance, which presumably can be fixed. Also note that most targets of hostile takeovers did not previously make an acquisition, so this is only a partial explanation.

The main analysis is based on an event study methodology of abnormal bidder returns around the bid announcement. Average abnormal returns for different classifications of the bidders are compared. On average, bidders earn a negligible return, but non-targets actually earn a positive return. Subsequent targets, especially those in hostile takeovers, earn negative returns. Since the divestiture rate is higher for subsequent targets than for non-targets, it appears that the bad bidders are bad because they have poor ongoing performance. Logit regressions give evidence that firms that make bad acquisitions are more likely to get takeover offers than firms that make good acquisitions.

Mitchell & Mulherin (1996)
The Mitchell and Mulherin (1996) paper addresses the impact of industry shocks on the high level of restructuring in the 1980s.
The hypothesis is that tender offers, mergers, and LBOs are among the lowest cost means of responding to industry change. The study is motivated by the high concentration of restructuring within industries. If these activities are driven by industry effects, the announcement of one firm in an industry should provide information about the prospects of the other firms in that industry. In this sense it is not surprising that we see poor performance following a takeover. These activities are not the cause of a problem, but rather a response to a
problem. The study is based on the roughly 1,000 Value Line firms in 1981. These firms are tracked throughout the rest of the decade and marked as to the type of takeover target they were (if at all). The analysis indicates that over the full period there is significant clustering of takeover activity at the industry level. Furthermore, within industries there is also clustering over time. Across all industries takeovers are spread fairly evenly over time. This provides evidence that takeovers are responses to industry-specific shocks. Further analysis indicates that this industry clustering was less common in the 1970s. Regressions of takeover activity on variables measuring sales and employment shock and growth indicate that it is industry change, not growth, that drives the takeovers. The findings in this paper indicate the problem of asset liquidity in Shleifer and Vishny (1992) may be important.
5.8 Financial Distress
When a firm faces financial distress a number of problems arise. Depending on how severe the distress, the firm may be tempted to underinvest or engage in risk-shifting. A firm in financial distress can attempt to reschedule its debt, raise cash via the issuance of new securities, or sell some of its assets. It is important to distinguish between financial and economic distress. Bankruptcy proceedings are intended to do so by directing assets to their greatest use. In some cases this means liquidating the assets and dissolving the firm, whereas in other cases it means reorganizing the firm and its financing to preserve going-concern value. There is debate over whether markets or courts are better at resolving distress. There is evidence that competing firms experience a stock price drop on the bankruptcy announcement, indicating that the announcement signals poor industry conditions [see Mitchell and Mulherin (1996)]. However, firms in concentrated industries with low leverage have price increases. Zender (1991) discusses optimal security design that implements efficient investment. Bankruptcy is the mechanism that creates a state-contingent transfer of control. The direct costs of bankruptcy are relatively small. Indirect costs may be more significant, but are hard to measure. Liquidation costs are distinct from costs of financial distress and arise in the process of selling assets.
5.8.1 Factors Affecting Reorganizations
Free Rider Problem
A debt restructuring requires unanimous approval of all holders of a class of security. To get around this requirement, a firm can use an exchange offer, where security holders have the right to participate in the exchange. Since the restructuring is designed to increase the health of the firm, the old debt increases in value. Therefore, some of the bondholders may hold out of the exchange and capture this increase in value. Since all the bondholders (in a given class) have the same incentives, the exchange is likely to fail. This problem can be solved with different indenture provisions ex ante, or with coercive participation. Examples of indenture provisions include granting a trustee the right to accept offers on behalf of the bondholders, requiring only a majority for approval, or including a “continuous” call provision. Coercive methods include ex post modification of the covenants directly.

Information Asymmetries
Insiders and outsiders may disagree about the value of the firm due to differential information. Further, they may have incentives to intentionally misrepresent the value of their claims. The state of financial distress may be misrepresented as well (e.g., discount bonds). Insiders of a firm with poor prospects may hide the truth, whereas insiders of a firm with better prospects may claim they are in distress hoping for a favorable debt renegotiation. Intermediate payments such as coupons and deviations from the APR rule can reduce these problems.

Agency Costs
The various investor groups and managers have different incentives in the bankruptcy process, leading to conflicts of interest. Some of these groups may join together to form a coalition to increase their bargaining power.

Managerial Behavior
Fama (1980) posits that a competitive market for managerial talent is an important mechanism to control the behavior of corporate managers. Managerial behavior is likely to be influenced by financial distress for several
reasons, including direct financial effects, potential loss of future income, loss of firm-specific human capital, loss of power/prestige, and reputation effects. It is difficult to observe managerial ability, so it is hard to tell if financial distress is due to poor management, the wrong incentives, or an adverse environment. There is evidence of increased board turnover prior to financial distress, just when the board is most needed to monitor the managers. Managers of distressed firms are many times more likely to experience turnover than managers at healthy firms. There is also evidence that the role of the board changes after restructuring.
5.8.2 Private Resolution
Private resolution of financial distress involves activities outside formal bankruptcy proceedings. Common techniques include exchange offers, tender offers, covenant modification, maturity extension, or rate adjustment.

Evidence on Restructurings
Asset and financial characteristics jointly affect the choice of restructuring mechanism. Private workouts are more common for firms with (i) more intangible assets, (ii) fewer classes of debt, and (iii) greater reliance on bank financing. There is evidence that the market is capable of predicting whether a workout will be successful, and that workouts are a more efficient form of reorganization than Chapter 11.[3] Evidence from the Japanese markets indicates that firms with close ties to a main bank are able to invest more and increase sales more following the onset of financial distress. The close relationship with the main bank internalizes some of the free rider and asymmetric information problems.

[3] This may be misleading since the firms that choose Ch. 11 may have done so optimally given the characteristics of their bankruptcy.

Asset Sales
A firm may sell some of its assets to relieve its financial distress. Asset sales may be different for distressed firms than for healthy firms. As discussed in Section 5.15, Shleifer and Vishny (1992) suggest that the secondary market for interfirm asset sales may be subject to adverse liquidity problems. The purchaser may be exposed to unique risks in the transaction with the distressed firm, or the purchaser may also be distressed if there are industry problems. These factors combine to reduce the attractiveness of the asset sale. Evidence indicates that asset sales among distressed firms are more common when the firm has several divisions.

New Capital
If the firm still has good projects it may wish to acquire additional capital. If the firm is in distress, it may have difficulty raising capital, as in Myers (1977). Underinvestment arises because much of the benefit from the new capital goes to the old debtholders. To solve this problem, new securities should be senior and/or asset-backed.
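The underinvestment argument can be made concrete with a two-state example. The sketch below is purely illustrative: all numbers, the zero discount rate, and the helper names `payoffs` and `expected` are assumptions introduced here, not figures from Myers (1977). It compares shareholder value when a positive-NPV project is skipped, funded with junior (shareholder) money, or funded with new senior debt.

```python
def payoffs(assets, senior_face, old_face):
    """Split asset value among senior debt, old debt, and equity, in that order."""
    senior = min(senior_face, assets)
    old = min(old_face, assets - senior)
    equity = assets - senior - old
    return senior, old, equity


def expected(values, probs):
    """Probability-weighted average of state-contingent values."""
    return sum(v * p for v, p in zip(values, probs))


# Hypothetical inputs: two equally likely states, zero discount rate.
probs = [0.5, 0.5]
assets = [100.0, 20.0]     # value of existing assets in the good and bad state
old_face = 60.0            # face value of the existing debt
cost, payoff = 10.0, 15.0  # project outlay today and its certain payoff (NPV = +5)

# 1) No project.
eq_none = expected([payoffs(a, 0.0, old_face)[2] for a in assets], probs)
debt_none = expected([payoffs(a, 0.0, old_face)[1] for a in assets], probs)

# 2) Project funded by shareholders (junior money); equity is net of the outlay.
eq_junior = expected([payoffs(a + payoff, 0.0, old_face)[2] for a in assets], probs) - cost
debt_junior = expected([payoffs(a + payoff, 0.0, old_face)[1] for a in assets], probs)

# 3) Project funded by new senior debt with face value equal to the outlay.
eq_senior = expected([payoffs(a + payoff, cost, old_face)[2] for a in assets], probs)
debt_senior = expected([payoffs(a + payoff, cost, old_face)[1] for a in assets], probs)

print("equity:   none", eq_none, " junior", eq_junior, " senior", eq_senior)
print("old debt: none", debt_none, " junior", debt_junior, " senior", debt_senior)
```

Under these assumed numbers, junior financing lowers equity value from 20 to 17.5 while old debt rises from 40 to 47.5, so shareholders pass up the project. With senior financing the new lender is repaid in full in both states, equity rises to 22.5, and the project is taken.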
5.8.3 Formal Resolution
Since a firm can generally choose private or formal bankruptcy proceedings, the cost of bankruptcy will be the lesser of the two. The ability to choose avenues will cause many of the features in the formal proceedings to appear in the private resolutions as well.

Liquidation (Ch. 7)

Reorganization (Ch. 11)
• Automatic stay
  – Stops principal and interest payments to unsecured creditors
  – Secured creditors lose rights to collateral, may receive “adequate protection” payments
  – Effectively extends maturity of debt
  – Executory contracts can be assumed or rejected
  – Reduces blocking power of debtholders and leads to renegotiation
• Debtor-in-possession
  – Current management and directors typically retain control
  – Management can file reorganization plan within 120 days, extensions are common
  – Incremental senior borrowing is allowed, strip seniority/collateral from existing debt
• Negotiation
  – All classes of creditors and court must approve agreement
  – “Cramdown” forces creditors to accept the plan
  – Management delays are a bargaining tool transferring wealth from debt to equity
  – Debtors have favorable bargaining power
  – Power given to management/equity viewed as compensation for not exercising option to delay or shift risk

Chapter 11 has ambiguous effects on efficiency, but provides the greatest economic benefit when underinvestment is a problem.
Prepackaged Bankruptcy
A firm is allowed to simultaneously file for bankruptcy and give its plan of reorganization. This allows the firm to get the efficiency of the private restructuring, yet retain some of the benefits of the formal proceeding (e.g., the cramdown and certain tax benefits). Prepacks may also reduce the holdout problem inherent in workouts.
5.8.4 Key Papers
Ross (1977)
Ross (1977a) develops a theory incorporating managerial incentives into the capital structure decision. Insiders have private information and are compensated by a known incentive schedule. Because the manager incurs a penalty if the firm goes into bankruptcy, debt is costly for managers, and more so for managers at lower-quality firms. This makes the amount of debt a valid signal. The signal then influences the market’s perception of the firm’s risk, although it does not affect the actual risk. The M&M irrelevancy result holds within a risk class, but there is an optimal capital structure for each firm type. There are several empirical predictions from this model. Cross-sectionally, the cost of capital will be unrelated to the financing decision, although the debt level is uniquely determined. Bankruptcy risk should be an increasing function of firm type and debt level. Finally, value should increase with leverage in the cross-section.
James (1995)
The paper by James (1995) attempts to understand the conditions under which a bank will take equity in a distressed firm. Bank debt is generally thought to be easier to renegotiate than public debt since coordination is easier. Banks have limited incentives to make unilateral concessions, since this will create a wealth transfer to junior claimants. Banks are more likely to take equity when bankruptcy costs are high, such as when a firm has significant growth opportunities. James examines roughly 100 bank debt restructurings in the 1980s. In some cases the firms attempted restructuring of public debt as well. The restructurings involved either forgiving financial obligations or modifying the terms of the debt.

There are five general findings. First, whenever the bank takes equity the public debtholders (if any) also take equity. Public bondholders are much more likely to act unilaterally than are banks. Second, banks tend to make larger concessions when there is no public debt. Third, banks also tend to take relatively large equity positions and hold them for several years. Fourth, banks are more likely to take equity when the firm has a small proportion of public debt, more valuable growth opportunities, greater cashflow constraints, and poor prior operating performance. Finally, the firms in which banks take equity tend to perform better subsequently than the ones in which they do not.

Hotchkiss (1995)
The basic goal of Hotchkiss (1995) is to see if Chapter 11 bankruptcy proceedings are effective in reviving troubled companies. Results indicate that a large number of firms are not viable after the reorganization and that existing management’s role in the process is associated with continued poor performance. The latter point may mean either that the process favors management or that these distressed firms have difficulty in attracting new managers.
The paper includes an analysis of post-bankruptcy operating performance in terms of accounting profitability, deviation from cashflow projections, and subsequent distress. Many of the firms increase in size shortly after bankruptcy. The firms begin with average profitability in their industry five years prior to bankruptcy. Closer to the filing, performance deteriorates. Following confirmation of the plan performance improves somewhat, but a
number of firms continue to have trouble. The cashflow forecast errors are significantly negative each year, beyond any industry effect. This result may be due to incentives to make high forecasts; the managers who remain in control tend to make overly optimistic forecasts. Finally, roughly a third of the firms file for a second restructuring within a few years.

Logit regressions provide additional evidence about firm characteristics. Large firms are more likely to emerge as public companies and are less likely to report negative operating income. There is strong evidence that retaining the pre-bankruptcy CEO is positively related to poor post-bankruptcy performance. Finally, there is some evidence that firms filing in New York are more likely to remain in distress.

Weiss (1990)
Weiss (1990) performs an examination of the direct costs of bankruptcy and violation of the absolute priority rule. He finds direct costs average about 3% of firm value (20% of equity value) the year prior to bankruptcy. The absolute priority rule is frequently violated, especially in New York. There is no evidence that these cases are resolved more quickly. Larger transactions are more likely to violate strict priority since there are more opportunities to extract concessions. One view is that the violation of APR is to compensate equityholders for not exercising their option to delay the proceedings or pursue actions detrimental to the senior debtholders. Evidence suggests that equity markets anticipate the deviation from APR, and the junior debt incorporates a premium for APR violations.

Betker (1995)
In order to understand the effectiveness of prepackaged bankruptcies, Betker (1995) documents the costs and sources of economic gain associated with this method. The time spent in bankruptcy is much shorter, 2.5 months in a prepack versus 25 months in Chapter 11. The total time including preliminary negotiations is similar to Chapter 11 and is long relative to workouts.
The direct costs are estimated to be about 3%, very similar to the results in Weiss (1990) for Chapter 11 proceedings. Indirect costs in a prepack may be lower, but it is not clear by how much. It is possible that the indirect costs would be similar to a workout. Prepacks appear to offer some tax advantages
over workouts in treatment of NOLs, but not CODs.
5.9 Equity Issuance
The security issuance decision involves many of the same issues as capital structure. In addition, the issuance process creates other considerations. Seasoned Equity Offerings (SEOs) are similar to Initial Public Offerings (IPOs) in many respects. The primary difference is that there is an existing market price on which valuations can be based. This section discusses the common elements and the specifics of SEOs. The following section addresses the issues particular to IPOs.

Smith (1986) provides a review of the theory and evidence on security issuance. There are several theories that attempt to explain the empirical evidence. There may be an optimal capital structure, in which case optimizing firms should have non-negative valuation effects for capital structure changes. The issuance could serve as a signal of decreased cashflows as in Miller and Rock (1985). The degree of predictability will also influence the size of the announcement reaction. Since debt principal repayment is predictable, debt reissuances should also be more predictable and have smaller announcement effects. A similar argument can be made with high dividend yield firms such as utilities. As in Myers (1984) and Myers and Majluf (1984), when information asymmetries are large the price impacts should also be larger. Finally, changes in ownership concentration such as equity carve-outs may affect value. Table 5.3 summarizes these predictions and the related evidence.
Stylized Facts
• Retained earnings are the most common source of financing
• Debt is used more than equity, net retirement of equity in the 1980s
• Increased use of leverage over time
• Equity is issued relatively more frequently during expansions
• Private placements are becoming more important
• Gradual switch from rights to firm commitment
• Strong preference for firm commitment for non-equity issues
• IPOs use firm commitment (60%) or best efforts (40%)
• DRIPs and ESOPs have replaced rights offerings
• Underwritten offers are more expensive (directly), but more common
Table 5.3: Theories of Security Issuance Reactions

Theory | Prediction | Evidence
Optimal Capital Structure | AR > 0 | Opler and Titman (1995)
Info. Asymmetry, Myers and Majluf (1984) | AR < 0, more so for securities with high info. asymmetry | Yes: Mikkelson and Partch (1986), Eckbo and Masulis (1992), Opler and Titman (1995). No: Helwege and Liang (1996)
Signaling, Miller and Rock (1985) | Issue signals lower earnings | Opler and Titman (1995). Yes: Mikkelson and Partch (1986), Prabhala. No: ?
Ownership Concentration, Eckbo and Masulis (1992) | Use underwriting with disperse ownership | Eckbo and Masulis (1992)
Predictability, Prabhala (1993) | Smaller reaction to predictable issues | Prabhala (1993), Mikkelson and Partch (1986)
5.9.1 Flotation Methods
Since the use of an underwriter has higher direct costs than a rights offer, there must be some indirect benefits provided by the underwriter. The flotation choice can be viewed as an attempt to signal firm quality. The underwriter also acts as a monitor or certifying agent. The best firms will use standby rights offers, medium quality firms will use uninsured rights, and the worst firms will use firm commitment underwriting. The flotation choice can also be viewed as an optimal risk sharing contract in a principal-agent problem, where the issuer is the principal and the investment bank is the agent. The issuer wants to incent the banker to expend effort which is difficult to measure or observe.
Firm Commitment
In a firm commitment the investment bank assumes the risk of the offer. It essentially buys the offer from the issuer and is responsible for selling it. The process begins with an SEC filing. Next, a preliminary prospectus stating a range of offer prices and the maximum number of shares is issued. After SEC approval, the final offer price is set and a final prospectus is issued. The underwriter’s guarantee begins once the final offer price is set. Competition among underwriters has led to the “bought deal,” where an investment bank will buy an entire issuance outright. Firm commitment becomes more attractive with less asymmetric information, more risk-averse issuers, less risk-averse underwriters, less price uncertainty, or when the investment bank’s effort is more observable.
Best Efforts
In a best efforts offer the underwriter acts as a marketing agent on behalf of the issuer. The issuer bears the risk of the offering. The filing process is similar to a firm commitment offer, except there is a minimum sales level below which the offer will be withdrawn. After SEC approval, the underwriter attempts to sell the issue during a selling period. Average initial returns are higher with best efforts offerings.
Rights Offers
Current shareholders are given short-term warrants in proportion to their shareholdings. Shareholders can either exercise the warrants or sell them. The subscription price is typically 15-20% below the current market price. Sometimes rights offers use standby underwriting to guarantee the proceeds of unsubscribed shares. Rights offers in the U.S. are typically fully subscribed.

Indirect Issuances
Convertibles, warrants, options, DRIPs, and ESOPs are examples of indirect methods of equity issuance. Stein (1992) develops a theory for convertible issuance as “back door” equity financing. DRIPs and ESOPs have replaced rights offerings to some extent.

Shelf Registration
The issuer can pre-register for the issuance of a security over a two-year period. This can reduce the direct costs of issuance but it increases the information asymmetry problem since it is easier for managers to time their offers.

Negotiated Bid
A firm can select its investment bank through either a negotiated or competitive bid process. Negotiated bids are more common, especially among larger issues, even though they are more expensive. The main users of competitive bid offers are utilities, which are required to use them. Possible explanations for the preference for negotiated bids include side payments to managers, increased accounting-based compensation to managers, lower variability in costs, reduced agency costs, and protection of proprietary information.
5.9.2 Direct Flotation Costs
A summary of direct flotation costs is shown in Panel A of Table 5.4. Direct flotation costs are generally higher for equity issues than for other securities. They also tend to be higher for industrial companies than for utilities.
Table 5.4: Some Issuance Costs

Panel A: Direct Flotation Costs
Method             Industrial   Utility
Rights             1.8%         0.5%
Standby Rights     4.0%         2.4%
Firm Commitment    6.1%         4.2%

Panel B: Seasoned Issue Valuation Effects
Security           Industrial   Utility
Equity             –3.14%       –0.75%
Conv. Preferred    –1.44%       –1.38%
Preferred          –0.19%       0.08%
Conv. Debt         –2.07%
Debt               –0.26%       –0.13%
Convertible debt offers have higher flotation costs than similar sized nonconvertible offers, consistent with the hypothesis that issue costs are related to security volatility. Not surprisingly, underwriter compensation is higher in negotiated contracts than in competitively bid contracts. Underwriter compensation has decreased since the introduction of shelf-registration, although this may be due to selection bias issues. Several firm characteristics are correlated with direct flotation costs [see Smith (1986) and Eckbo and Masulis (1992)]. The models use direct flotation costs as a percentage of issue proceeds as the dependent variable. A positive intercept indicates there are fixed costs to the issuance. Measures of size indicate that the costs are a decreasing, convex function of size, indicating there are economies of scale. High shareholder concentration also lowers issuance costs (this may be due to an increased reliance on subscription precommitments). The direct costs are positively related to stock volatility. Dummy variables indicate that rights offers have the lowest direct flotation costs, and firm commitment offers the highest. These results are robust to the time period used and across industrial and utility firms. Issuers often grant an overallotment option, allowing the underwriter to purchase additional shares if the offer is oversubscribed. This increases the underwriter’s incentive to sell the issue, reducing the risk of failure.
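To see why fixed costs imply that percentage costs are a decreasing, convex function of issue size, consider a minimal sketch in which the dollar cost is a fixed amount plus a constant fraction of proceeds. The functional form, the name `cost_pct`, and all numbers are hypothetical; this is not the regression specification estimated in the cited papers.

```python
def cost_pct(proceeds, fixed_cost, variable_rate):
    """Direct flotation cost as a percentage of issue proceeds."""
    return 100.0 * (fixed_cost / proceeds + variable_rate)


# Hypothetical inputs: $0.5 million fixed cost, 3% variable cost,
# issue sizes in $ millions.
sizes = [10.0, 20.0, 40.0, 80.0]
pcts = [cost_pct(s, fixed_cost=0.5, variable_rate=0.03) for s in sizes]

# Slope of percentage cost between neighbouring issue sizes.
slopes = [
    (pcts[i + 1] - pcts[i]) / (sizes[i + 1] - sizes[i])
    for i in range(len(sizes) - 1)
]

for s, p in zip(sizes, pcts):
    print(f"proceeds {s:5.1f}  cost {p:5.2f}%")
```

Percentage cost falls as the issue grows because the fixed component is spread over more proceeds, and each slope is less negative than the last, which is the convexity that the size measures pick up.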
5.9.3 Indirect Flotation Costs
Given the lower direct costs of rights offers, but the preference for firm commitment offers, indirect expenses may be important. Managers may receive personal benefits from underwriters, or there may be pressure from investment bankers who sit on the board. Also, sales to the public are more likely to create a more disperse ownership structure, either reducing the monitoring of managers as in Shleifer and Vishny (1986) or increasing liquidity as in Merton (1987). Expected rights offer failure costs are small. Other indirect costs may include the capital gains taxes and transactions costs to the shareholders associated with a rights offer. There may also be anti-dilution clauses and wealth transfers to convertible security holders.
5.9.4 Valuation Effects
Leverage-increasing transactions produce positive ARs, while leverage-decreasing transactions have a negative effect. SEO announcements have an average negative price impact of about 3%. This contrasts with no significant price impacts for the announcement of straight debt, equity sold through rights offers, or private placements. Common stock offer cancellations are also associated with positive reactions. Indirect equity issuances (e.g., convertible debt) are also associated with negative announcement reactions. Evidence on shelf registration indicates a more negative reaction, which is consistent with the increased adverse selection problem. These valuation effects are summarized in Panel B of Table 5.4.

There are several possible explanations for these valuation effects. If there is an optimal capital structure, then a change to restore the optimal level should be met with a positive reaction. The nonpositive announcement effects do not support this hypothesis, although the announcement may also convey information about the firm’s situation. This signaling effect, as in Ross (1977a), implies that leverage-decreasing events signal negative revisions in management’s expectations, and should be accompanied by a negative price reaction. Under the Miller and Rock (1985) model, any security issuance signals lower than anticipated operating cash flow and is bad news. There is some evidence indicating firms tend to issue debt following earnings declines, whereas equity issuances tend to come before an abnormal earnings decline.

The adverse selection problem in Myers and Majluf (1984) can also explain some of the valuation effects. In an extension to this framework, managers who choose the size of the offer choose larger offers when their stock is more overvalued. Other adverse selection problems arise because the underwriter is able to distribute shares in the “good” offers to preferred clients. Also, as in Rock (1986), with differentially informed investors the underpriced issues will be oversubscribed, while the overpriced ones will go to the uninformed investors. There are some (partial) solutions to these adverse selection problems. The firm can try to change managerial incentives, use private placements, maintain financial slack, use certifying institutions, use equity carveouts, or issue convertible securities.

Much of the empirical evidence is consistent with the adverse selection hypotheses. The announcements have nonpositive effects regardless of security type, but larger or riskier issues have more negative reactions. Firm commitment offers have the most negative reactions, followed by standbys, then uninsured rights. These results are consistent with the model of Eckbo and Masulis (1992). Other supportive evidence includes the more negative reaction for industrial issues than for utilities, and more negative reactions to shelf registration announcements.
5.9.5 SEO Timing
There is some evidence supporting the hypothesis that equity offers will be more frequent in an expansion. The argument is that there are more profitable investment opportunities in these times, and firms are less likely to forego investment projects because of underpricing. Additional evidence indicates that the announcement effect is less negative during expansions for equity issues, while announcements of debt issuances are not affected. The Myers (1984) pecking order hypothesis suggests firms will issue equity in economic downturns because they are less likely to have excess cash and their leverage is likely to have increased as market values of equity have fallen. The relative regularity of debt issuances raises the possibility that the announcement effect is small because the market anticipates the issuance. Jung, Kim, and Stulz (1996) and Opler and Titman (1995) provide evidence that the debt-equity choice is predictable.
5.9.6 Key Papers
Eckbo & Masulis (1992)
Eckbo and Masulis (1992) model the choice between a rights issue and an underwritten offer as an extension to Myers and Majluf (1984). In the model, shareholder takeup k is an important determinant of the flotation method. Firms using uninsured rights offers may use subscription precommitments to credibly signal a high takeup. The precommitments in rights offers and underwriter certification in firm commitment offers serve to reduce the wealth transfer between current shareholders and outsiders. Firms with more dispersed ownership will tend to choose underwriting. Firms with less discretion over their issuance, such as a utility, will tend to use a rights offer. The model predicts that the announcement effect will be most negative for firm commitment offers, followed by standby rights and uninsured rights. This analysis can be applied to other flotation methods as well.

To analyze the determinants of direct costs, they estimate a regression with measures of size, percentage change in shares, ownership concentration, return standard deviation, and dummies for offer type. The results indicate there are significant fixed costs and economies of scale. A positive coefficient on the change in shares variable indicates there are adverse selection costs. High ownership concentration lowers direct issuance costs, perhaps through precommitments. More volatile returns are associated with higher costs since there is increased underwriting risk. After controlling for issue characteristics, rights offers are still less expensive than standby or firm commitment offers.

Having documented the rights issue paradox, the authors present a model to explain it. In the model, firms will issue if the value of the projects b exceeds the sum of the direct cost f and the dilution cost c from issuing undervalued securities, b − (f + c) ≥ 0. The cost c(k, m) depends on the level of existing shareholder participation. Managers select the flotation method m to maximize firm value.
The market gets information about k through precommitments, trading volumes and actual subscription levels. With full participation the dilution cost is zero. When k < 1 some undervalued firms will find it too costly to issue. In this sense k is similar to the inverse of slack. High-k firms select uninsured rights and use takeup to substitute for the underwriter guarantee. Firms with k ∈ (k_f, k_s) will choose
standby rights. The lowest-k firms will not bother paying the additional rights distribution costs and will just use firm commitment offers. If a firm is overvalued then high-k firms may choose either uninsured rights or they may “hide” with a firm commitment offer. If they are detected they will sell at a lower price or cancel and forgo the project. Since the market understands these strategies, the high-k firms will face the lowest adverse selection costs and the low-k firms the highest costs.

The authors test their model using an event study methodology. Consistent with prior literature, the negative market reaction is strongest for firm commitment offers and weakest for uninsured rights. After adjusting for flotation costs, either type of rights issue has a negligible effect. Reactions are less negative for utilities, consistent with smaller adverse selection. Firm commitment offers are generally associated with stock price runups, whereas this effect is smaller for standby rights and negligible for uninsured rights.

Mikkelson & Partch (1986)
In a study similar to Masulis (1980), Mikkelson and Partch (1986) reexamine the effect of announcements of capital structure changes on stock price to better understand the determinants. They find a significant negative announcement effect for stock and convertible debt, and a less pronounced effect for debt. Completed offerings have positive returns between announcement and issuance, and a negative return at the issuance, indicating that firms time their security issuance. Similarly, firms that refinance have more negative reactions than those that raise funds for capital expenditures. In general, the results are consistent with the predictions of Myers and Majluf (1984) and the notion of a pecking order.

The paper uses an event study methodology to measure excess returns. The estimation window for α and β is the 140 days beginning 21 days after issuance or cancellation.
Throughout their sample the number of announcements varies considerably across time. External financing is not a common event for many firms, consistent with the pecking order hypothesis. Equity offers tend to finance new assets. Among public offerings for cash, stock has the most negative abnormal return at about –4% and straight debt the least negative reaction. These results are consistent with the predictions in both Myers and Majluf (1984) and Miller and Rock (1985), although the latter paper does not distinguish between security types.
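
The event-study calculation described for this paper can be sketched as follows. This is a minimal illustration, assuming a standard market model fitted by OLS; the function name, the synthetic return series, and the choice of event days are hypothetical and are not taken from Mikkelson and Partch (1986). Only the window convention (140 days beginning 21 days after issuance) follows the text.

```python
import numpy as np

def market_model_abnormal_return(stock, market, event_day, est_start, est_len=140):
    """Fit r_stock = alpha + beta * r_market by OLS over the estimation window,
    then return (alpha, beta, abnormal return on event_day)."""
    window = slice(est_start, est_start + est_len)
    X = np.column_stack([np.ones(est_len), market[window]])
    alpha, beta = np.linalg.lstsq(X, stock[window], rcond=None)[0]
    abnormal = stock[event_day] - (alpha + beta * market[event_day])
    return float(alpha), float(beta), float(abnormal)

# Synthetic daily returns (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
market = rng.normal(0.0, 0.01, size=250)
stock = 0.001 + 1.2 * market           # noiseless market model, for clarity
announcement_day, issuance_day = 5, 20  # assumed event dates
stock[announcement_day] -= 0.04         # hypothetical -4% announcement effect

# Estimation window starts 21 days after issuance, as in the text.
alpha, beta, ar = market_model_abnormal_return(
    stock, market, announcement_day, est_start=issuance_day + 21)
print(round(beta, 3), round(ar, 3))
```

Using a post-event estimation window keeps any pre-announcement runup out of the α and β estimates, which matters here because equity issuers tend to have unusually high returns before the announcement.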
Table 5.5: Price Reactions by Issuance Type

Issuance            Pre-AD    AD    AD-ID    ID
All Equity             +       –
All Debt               –       0
Completed Equity                      +       –
Cancelled Equity                      –       +
Completed Debt                        0       0

AD is the announcement date and ID the issuance date (the cancellation date for cancelled offers).
The abnormal returns surrounding these events provide evidence that managers time security issuance. Prior to the announcement, equity offers tend to have runups, while debt offers tend to have declines. At the announcement, the equity offers have the most negative returns and the debt offers the least negative. For completed offers, equity again has a runup between the announcement and issuance, while debt is essentially flat. At the issuance there is another negative effect for equity offers and a neutral reaction to debt issuances. In general, convertible debt and preferred stock fall in between the debt and equity effects, although small sample sizes make interpretation more tenuous.

A more direct analysis of cancelled and completed offers confirms that the cancelled offers have declines between the announcement and cancellation, while completed offers increase in price between announcement and issuance. Further, at the cancellation there is a positive return versus a negative return at the issuance. Note that these patterns are ex post, and it is not likely that there are any profitable trading strategies. An effort to determine whether debt ratings make a difference is inconclusive due to small sample sizes, but announcements of bank credit lines are associated with positive abnormal returns.

Opler & Titman (1995)

If there is an optimal capital structure then firms experiencing an equity price runup should issue debt to move back towards the optimum. Evidence that firms issue equity after a runup seems to indicate the opposite. Opler and Titman (1995) address this issue by seeing if deviations from an estimate of the optimal debt ratio are useful in predicting whether the firm issues debt. The general results indicate that firms do move towards a target debt
ratio. A puzzling finding is that the security choice of firms least subject to information asymmetry is the most sensitive to recent returns.

There are several possible explanations of equity issuance after runups. The optimal capital structure could simply change over time. If firms whose growth opportunities improve have price runups, these firms should desire relatively more equity financing. An agency theory explanation is that additional debt constrains a manager’s ability to grow and raises the probability of default and firing. The observed behavior is also consistent with the Myers and Majluf (1984) model where the firms with overvalued securities issue. A behavioral explanation is that managers want to avoid dilution as a rational response to an irrational market.

The analysis is performed in two stages. In the first stage debt ratios are regressed on proxies for growth opportunities and size to get predicted debt levels. Deviations from the predicted level and control variables are then used to predict the probability of debt issuance in a second stage. The second stage regressions are further stratified by size, dividend policy, and utilities.

Their findings do not fully support any of the proposed theories. Partial support comes from several observations. Profitable firms issue debt or repurchase shares to offset the accumulation of retained earnings. The larger issuances tend to involve equity, perhaps in response to the higher fixed costs. Stock return and M/B are good predictors of equity issuance. The results on convertible debt generally fall between debt and equity. Firms that issue short-term debt are less profitable than equity issuers, whereas long-term debt issuers are more profitable. The results from the stratified regressions are less supportive of the theories. Utilities, firms that pay dividends, and firms followed by more analysts are more sensitive to recent returns in their security choice.
Small firms are less sensitive to price runups. In the more active market for corporate control in the mid-80s, there is no evidence that managers are less willing to issue equity following a stock price decline.

Helwege & Liang (1996)

Helwege and Liang (1996) test the pecking order theory using a sample of IPOs. The basic design is to identify a cohort of firms going public in 1983 and follow their financing choices through time. The general finding is that there is little support for the pecking order. The probability of external
financing is unrelated to internal cash shortages and financing patterns indicate an “overuse” of equity.

The study starts with 367 firms. Over the next decade roughly equal numbers go bankrupt, are acquired, and survive. The firms tend to have losses early in their lives then show increases in profitability. Dividends are rarely paid and there is a tendency to rely on internal funds over time. Firms seem to choose private debt, then equity, then public debt. Large firms tend to issue debt, whereas small firms or low growth firms use more private debt. Coefficient estimates on default and asymmetric information variables are mostly inconsistent with the pecking order. Riskier firms tend to issue more equity.

Jung, Kim & Stulz (1996)

Jung, Kim, and Stulz (1996) perform a test comparing the issuance decision, market reaction, and subsequent actions predicted by the pecking order [Myers (1984)], agency [a special case of Myers and Majluf (1984)], and issuance timing [Loughran and Ritter (1995)] theories. The findings are consistent with agency theory, partially support the pecking order, and do not support issuance timing.

Under agency theory a manager will issue when the firm’s shares are overvalued to maximize current shareholder value (assuming the firm cannot issue riskless debt). This setup is a special case of Myers and Majluf (1984) where there is no information asymmetry about the assets in place but the manager has incentives to issue equity to take negative NPV projects that are privately valuable. The agency cost of outside equity arises because of managerial discretion. Issuing debt instead reduces the manager’s discretion, but gives rise to the underinvestment problem of Myers (1977) since gains go to the bondholders first. Thus, high growth firms will have less leverage to avoid forgoing good projects. The pecking order says firms should issue debt instead of equity whenever possible.
The timing model predicts firms will issue equity when overvalued, so subsequent returns should be lower. The analysis is based on a sample of debt and equity issues between 1977 and 1984. Equity issuers tend to be smaller, riskier, growth oriented firms. The security issue choice is estimated with a logistic regression. High M/B, leading indicators, and recent returns are positively related to the choice of equity. Firms with high taxes are less likely to issue equity. Firms that are predicted to issue debt but instead issue equity appear to overinvest.
Overall, abnormal returns are negative for equity issues and insignificant for debt issues. For equity issues with high prior excess returns the announcement abnormal return is positive. The correlation between the firm type predicted in the logit regression and abnormal returns is positive for equity issues and negative for debt issues. This is evidence supportive of the agency theory but not the pecking order theory. The general results indicate that firms issuing equity tend to either have valuable growth opportunities or lack valuable growth opportunities but have excess debt capacity. The firms lacking valuable growth opportunities have more negative stock price reactions to the announcement of an equity issuance. Other evidence indicates that some firms issue equity to benefit the managers rather than the shareholders.
5.10 Initial Public Offerings
This section discusses the features that distinguish IPOs from seasoned issuances. The primary difference is that valuing an IPO is more difficult than valuing an existing public firm. Essentially all the problems with SEOs remain, with additional complications. The potential benefits of going public include (i) diversification, (ii) liquidity, and (iii) more capital to take good projects. The costs include (i) information collection/disclosure, (ii) legal and auditing costs, (iii) underwriting and one-time direct issuance costs, (iv) management time and effort, and (v) dilution. Much of the general discussion in this section comes from Ibbotson and Ritter (1995).

IPOs are a stage in the life cycle of a firm. Initially, firms will be self-financed since the capital requirements are the smallest and the information asymmetry problems the largest. The next step is often financing by friends, relatives, and associates. Personal relationships serve to align the interests of the manager and the investors. Next come non-affiliated sources of private capital such as bank financing and venture capital. These owners typically require a large amount of information disclosure and are often active investors. After exhausting available private financing a firm will go public.

IPOs are characterized by information asymmetry problems. In particular, there are adverse selection problems since the owners self-select into going public, and moral hazard problems since the manager/owners affect the value of the firm. There are several mechanisms to deal with the informational asymmetries. By holding a sizeable portion of the firm the manager
Table 5.6: Theories of IPO Underpricing

Theory             Prediction
Winner’s curse     riskier issues, greater underpricing
Costly info. acq.  upward revisions more underpriced
Cascades           underprice to guarantee success
I-banker power     no underpricing of IB IPOs
Lawsuits           underprice to avoid lawsuits
Signaling          underprice IPO for successful SEO
Regulatory         utilities less underpriced
Wealth redist.     bribe underwriter
Stabilization      support generates return
Ownership disp.    underpricing creates diverse ownership
Market incompl.    compensation for risk-bearing

Evidence (row alignment not recoverable): some, yes, no, mixed, no, limited, no, some
provides a signal to outsiders. Managers may agree to a “lock up” period, during which they will not sell their shares. A manager could also take a small fixed salary in exchange for a contingent compensation scheme. Firms may hire certifying agents [see James (1995) or Smith (1986)] who have credibility arising from their desire to protect their reputation capital. There is evidence supporting the role of certifying agents [Booth and Chua (1996)].

5.10.1 IPO Anomalies

There are three puzzling observations with respect to IPOs. The issues are significantly underpriced relative to their secondary market values, yet over the longer term IPOs tend to underperform. There are also cycles in the extent of underpricing.

New Issue Underpricing

Initial returns are skewed, with a positive mean and a median near zero. Smaller offerings are more underpriced, so equally-weighted returns overstate the degree of underpricing. The underpricing effect is present internationally. Underpricing can be viewed as the solution to a moral hazard problem between the issuer and the underwriter. The degree of underpricing will increase with demand uncertainty.

Rock (1986) argues there will be a winner’s curse due to an adverse selection problem arising from an information asymmetry between informed and uninformed investors. In this model the market price and underwriter offers are jointly determined in equilibrium and the selling mechanism is exogenous. The banker sets an optimal offer and the market reacts. The informed agents will only buy IPOs if they are good deals, in which case the issues will be oversubscribed and the uninformed will not be able to get their desired amount. In the case where the IPO is overpriced, the informed will pass, leaving the full amount to the uninformed. The uninformed will realize this and require a discount on all issues in order to buy any of the IPOs. An implication is that underpricing should be greater for riskier issues. Koh and Walter (1989) provide direct evidence in support of the model.

Welch (1992) presents a model of information cascades where agents’ decisions are influenced by the actions of other agents. Firms will underprice to get the first few investors to participate, starting a cascade. Booth and Chua (1996) argue that shares are more valuable to investors when they are liquid. Providing a more dispersed ownership structure will increase the liquidity of the shares. Shares are underpriced to compensate a broad investor base for costly information acquisition.
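
Rock's winner's curse argument can be made concrete with a small calculation. The sketch below is illustrative only: every number in it (the share of underpriced issues, the initial returns, and the allocation probabilities) is hypothetical and chosen so that an uninformed investor who applies for every issue breaks even despite positive average underpricing. The function names are likewise assumptions.

```python
def average_initial_return(p_good, r_good, r_bad):
    """Equally weighted average initial return across all issues."""
    return p_good * r_good + (1 - p_good) * r_bad

def allocation_weighted_return(p_good, r_good, r_bad, a_good, a_bad):
    """Average return per share actually allocated to an uninformed investor
    who applies for every issue but is rationed with probabilities a_good, a_bad."""
    w_good = p_good * a_good          # weight on underpriced issues received
    w_bad = (1 - p_good) * a_bad      # weight on overpriced issues received
    return (w_good * r_good + w_bad * r_bad) / (w_good + w_bad)

# Hypothetical inputs: half the issues are underpriced (+30%), half overpriced (-10%).
# Underpriced issues are oversubscribed, so the uninformed are three times as
# likely to receive an overpriced issue as an underpriced one.
p_good, r_good, r_bad = 0.5, 0.30, -0.10
a_good, a_bad = 0.25, 0.75

headline = average_initial_return(p_good, r_good, r_bad)
realized = allocation_weighted_return(p_good, r_good, r_bad, a_good, a_bad)
print(f"average underpricing: {headline:.3f}, uninformed realized return: {realized:.3f}")
```

With these inputs the measured average underpricing is about 10 percent while the allocation-weighted return to the uninformed is approximately zero, which is the sense in which underpricing merely compensates the uninformed for rationing.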
Long-Run Underperformance

There is significant evidence that IPOs perform poorly after the initial large returns. The magnitude of this underperformance is on the order of a –15% CAR over the following three years. This type of underperformance is also present in closed-end funds and REITs. There is some evidence supporting each of the following theories of underperformance. The divergence of opinion argument of Miller (1977b) is that the buyers of IPOs are the most optimistic. With greater uncertainty, the difference between the optimistic and the pessimistic is larger. As time goes on, information will be revealed that will cause opinions to converge, and therefore the price will drop. There is some survey evidence supporting this theory. The impresario hypothesis suggests that investment bankers underprice initially to create the appearance of excess demand. Under the windows of opportunity hypothesis there is a sort of dynamic pecking order theory where firms will issue equity when it is overvalued in general.
Cycles

Cycles in both volume and underpricing are well documented but hard to explain as rational. One explanation is changing risk composition, meaning more risky offerings are underpriced more, and there may be a clustering of IPOs by similar firms. There is some evidence in this direction, but it is not entirely convincing. A second explanation is “positive feedback” strategies, where investors buy IPOs expecting positive autocorrelation. If enough investors do this, the autocorrelation becomes a self-fulfilling prophecy (this is basically the “greater fool” theory). This effect may be difficult to stop with arbitrage since it is difficult to short-sell the IPO [Rajan and Servaes (????)].
5.10.2 Key Papers
Welch (1992)

In his “cascades” model Welch (1992) provides an explanation of IPO underpricing. Investors are sequentially asked if they want part of the IPO. Each investor gets a signal of the value, but the investors cannot observe each other’s signals. Investors are able to observe the actions of those that went before them. When investors use the decisions of others to update their own beliefs, information cascades can occur where individuals disregard their own information and follow the masses. Issuers may underprice to ensure the first few investors accept and ensure success of the offer.

Cascades are not necessarily bad for an issuer. He is at less of an informational disadvantage since the individuals are unable to aggregate their information. The underwriter seeks to distribute the offer widely to make investor communication more difficult. With inside information, a high offer price increases the probability of offer failure and more so for lower quality issuers, creating a separating equilibrium.

The model uses an economy of rational, risk-neutral investors. The unknown true price of the firm is $V$, and the issuer has a reservation price $V^P \le V^L < V^H$. All participants have a prior $\tilde{V} \sim U[V^L, V^H]$. The price can be expressed as $p = \theta V^H + (1 - \theta) V^L$ where $\theta \in [0, 1]$ indexes the firm type. Each investor gets a signal $s \in \{H, L\}$ with $\mathrm{Prob}[s_i = H] = \theta$.

With perfect communication all successful offers are underpriced. Observed ex post underpricing is strictly increasing in uncertainty. If communication can only go from early to late investors things change slightly. Issuer proceeds are path-dependent, but in a large economy the perfect information results obtain. When only decisions, and not information, are observable, things change dramatically. Once an investor M with an H decides not to invest, no subsequent investors will invest. Similarly, once M with an L invests, all that follow him will also invest.
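
The case where only decisions are observable can be simulated in a few lines. The sketch below is a simplified reading of the setup rather than Welch's own formulation: values are normalized so that $V^L = 0$ and $V^H = 1$, a risk-neutral investor buys whenever the posterior mean of $\theta$ is at least the offer price (buying when indifferent is an assumption), and the posterior mean under the uniform prior after seeing h high signals out of n is (h + 1)/(n + 2). The function names are hypothetical.

```python
from fractions import Fraction

def posterior_mean(h, n):
    """E[theta | h high signals out of n] under a uniform prior on theta."""
    return Fraction(h + 1, n + 2)

def run_offer(price, signals):
    """Return each investor's buy decision when only actions are observable.

    signals is a sequence of 'H'/'L'; price is on the normalized [0, 1] scale.
    """
    h = n = 0          # signals that can be inferred from earlier informative actions
    decisions = []
    for s in signals:
        buy_if_high = posterior_mean(h + 1, n + 1) >= price
        buy_if_low = posterior_mean(h, n + 1) >= price
        if buy_if_high == buy_if_low:
            # Cascade: the action is the same for either signal, so it reveals nothing
            # and the public information stops updating.
            decisions.append(buy_if_high)
        else:
            # The action depends on the signal, so later investors can infer it.
            buys = (s == 'H')
            decisions.append(buys)
            h += 1 if buys else 0
            n += 1
    return decisions

low_price = run_offer(Fraction(1, 3), "LLLL")    # buy cascade from the first investor
high_price = run_offer(Fraction(3, 4), "HHHH")   # abstain cascade from the first investor
mid_price = run_offer(Fraction(1, 2), "HLLL")    # one early H triggers a buy cascade
print(low_price, high_price, mid_price)
```

Exact fractions are used so that the knife-edge comparison at a price of 1/3 is not affected by floating point rounding. At intermediate prices the outcome depends on the order in which signals arrive, which is the path dependence the model emphasizes.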
These results say that as the game goes on, individual signals get less weight relative to the information from previous decisions. As soon as one person goes against their information, anyone after them would place even less weight on that information. With cascades and an infinite number of investors the probability of failure is zero for $P \le 1/3$ and one for $P \ge 2/3$. All prices in between have an uncertain outcome. An uninformed risk-neutral issuer will optimally choose $P = 1/3$ and the offer always succeeds. Since everyone chooses this price and
the average price is 1/2, the expected IPO underpricing is 50%. Under these conditions, the issuer has higher expected proceeds with cascades than with path dependency or perfect communication. What if the issuer can modify the price based on past sales? Issuers with sufficiently high risk aversion may prefer to start an immediate cascade to path dependency with the option to change the price later.

The model has a number of implications. First, when distribution is less fragmented (a local issue) the issuer will underprice more. For these issues the offer price decreases with the issuer’s risk aversion and capital requirements. Issuers have an incentive to prevent communications to preserve the cascade. Welch argues that the winner’s curse in Rock (1986) is not important, since success of the offer is a foregone conclusion by the time the “marginal” investor is approached. To add another element of realism, issuers are given inside information about the firm type which is correlated with the outside signals. This makes it relatively less expensive for a high quality firm to raise the price than for a low quality firm, creating a separating equilibrium.

Loughran & Ritter (1995)

Loughran and Ritter (1995) attempt to understand the long run underperformance of new issues. They find that only a small part of the underperformance is explained by B/M effects. The degree of underperformance varies through time.

The authors calculate three- and five-year returns for a large sample of IPOs and SEOs. The issue date return is not included in the calculations. Returns are also calculated for a sample of non-issuing matching firms. Wealth indices of issuers’ returns relative to matching firms are calculated for the two holding periods by cohort year. In almost all cases these indices are less than one and deteriorate from the three year measure to the five year measure. For SEOs the issuing firms had extremely high (72% on average) returns in the year prior to issuance.
This underperformance is not due to mean reversion as in ?. Separating extreme winners into issuers and non-issuers, the issuers underperform over the next five years while the non-issuers beat the market. Since many researchers have documented a relation between the cross-section of returns and B/M, the paper tests for this effect. Although size and B/M are significant, a dummy variable for new issues is significantly negative, especially in periods following heavy volume. Using a Fama-French
three factor model, intercept estimates for issuers are significantly lower than for non-issuers. Also, the issuers have higher betas, which is inconsistent with the lower returns from the previous analysis.

Koh & Walter (1989)

Koh and Walter (1989) provide a direct test of the Rock (1986) model of IPO underpricing as a response to the winner’s curse. This test is unique in that it uses data from Singapore where rationing of oversubscribed offers is done in a special lottery. In this market all applicants for a given size of the issuance have an equal chance of winning. The tests confirm the implications of the Rock model that there is a winner’s curse and that uninformed investors earn a return similar to the riskless rate.

The authors use simulations to generate the returns to different bidding strategies. Assuming no rationing occurs, underpricing is large even after transactions costs. Examination of the probability of an allocation indicates that small investors are more likely to get an allocation. More importantly, investors are nearly three times as likely to get an overpriced issue as an underpriced one. Average returns incorporating costs and the probability of allocation are approximately zero, consistent with the Rock model. Also consistent are the correlations between proportions applied for or allocated to and initial returns. The small investors apply for and get a larger proportion of the issue when the issue is more fairly priced, whereas the larger investors apply for and get more when the issue is underpriced. Both small and large investors’ demands increase for underpriced issues, but the large investors are much more responsive.

Booth & Chua (1996)

Booth and Chua (1996) explain IPO underpricing as an attempt to generate ownership dispersion and enhance liquidity in the secondary market, giving a flavor of Merton (1987). In the model, informed investors are more likely to participate in secondary market trading.
The underpricing is set to compensate investors for information acquisition. They find that underpricing is positively related to information costs. Investment banker prestige is negatively related to underpricing in firm commitment offers and unrelated in best efforts. Finally, the clustering of issues seems to lower information costs and underpricing.
The model has a number of empirical predictions. Underpricing should be negatively related to the probability of receiving an allocation. The costs of achieving ownership dispersion and liquidity should be higher for best efforts offers since they tend to be smaller. Best efforts offers have a higher probability of failure so they should be more underpriced. Finally, best efforts offers should benefit most from clustering. Initial returns are regressed on firm size, offer price, IPO activity in the market and industry, underwriter rank, and interactions of these variables with dummies for offer type. The results indicate that more reputable underwriters underprice less in firm commitment offers, and that clustering is important, especially for best efforts offers.
5.11 Executive Compensation
Murphy (1985)

Murphy (1985) reexamines the relation between firm performance and executive compensation. This study focuses on individual executives over time and includes important explanatory variables as well as indirect forms of compensation which prior research has ignored. Murphy finds that executive compensation is strongly positively related to firm performance.

The paper attempts to avoid errors in variables problems associated with omitting factors such as entrepreneurial ability, managerial responsibility, firm size and past performance. If these factors are constant over time, then time series regressions for individual executives can correctly assess the sensitivity of pay to performance. The components of compensation under consideration include: salary, bonus, salary + bonus, deferred compensation, stock options, and total. Compensation is purged of any direct relation to the firm’s stock price, and compensation over time is re-expressed in 1983 dollars.

The analysis is conducted in two parts. The first part runs time series regressions of annual compensation for each executive on measures of performance for each firm-year and dummy variables to control for the executive’s position. The measures of performance include combinations of the stock return and sales growth. There is an intercept for each individual to capture any other important variables which are constant over time. The second part runs cross-sectional regressions using average compensation (over time) and average performance.
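
The contrast between the two parts of the analysis can be illustrated with a small sketch. It assumes a stylized panel in which pay equals an executive-specific intercept plus 0.2 times performance; the data, the 0.2 coefficient, and the function names are all hypothetical and are not Murphy's. The within (individual intercept) estimator recovers the pay-performance slope, while the regression on executive averages is contaminated by the omitted executive effects.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xd = x - x.mean()
    return float(xd @ (y - y.mean()) / (xd @ xd))

def within_slope(exec_id, perf, pay):
    """Time series estimator: demean by executive (an intercept per individual)."""
    exec_id = np.asarray(exec_id)
    perf_d = np.array(perf, dtype=float)
    pay_d = np.array(pay, dtype=float)
    for e in np.unique(exec_id):
        mask = exec_id == e
        perf_d[mask] -= perf_d[mask].mean()
        pay_d[mask] -= pay_d[mask].mean()
    return float(perf_d @ pay_d / (perf_d @ perf_d))

def between_slope(exec_id, perf, pay):
    """Cross-sectional estimator: regress average pay on average performance."""
    exec_id = np.asarray(exec_id)
    perf = np.asarray(perf, dtype=float)
    pay = np.asarray(pay, dtype=float)
    ids = np.unique(exec_id)
    mean_perf = [perf[exec_id == e].mean() for e in ids]
    mean_pay = [pay[exec_id == e].mean() for e in ids]
    return ols_slope(mean_perf, mean_pay)

# Hypothetical panel: 3 executives, 4 years each.
deviations = np.array([-0.15, -0.05, 0.05, 0.15])
avg_perf = {0: 0.0, 1: 0.1, 2: 0.2}      # average performance by executive
fixed_effect = {0: 3.0, 1: 2.0, 2: 1.0}  # unobserved, negatively related to performance
exec_id = np.repeat([0, 1, 2], 4)
perf = np.concatenate([avg_perf[e] + deviations for e in (0, 1, 2)])
pay = np.array([fixed_effect[e] for e in exec_id]) + 0.2 * perf

b_within = within_slope(exec_id, perf, pay)
b_between = between_slope(exec_id, perf, pay)
print(round(b_within, 3), round(b_between, 3))
```

In this constructed example the cross-sectional slope has the wrong sign even though every executive's pay rises with performance, which is the kind of mis-specification the individual intercepts are meant to remove.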
The results indicate that executive compensation changes by about 20% of the firm’s returns. The ranking of sensitivity is CEO, President, Chairman, and Vice President. The bonus is most sensitive to performance and the option compensation is negatively related to performance. The cross-sectional regression without sales growth gives the wrong signs, evidence of mis-specification. Adding the sales growth to the regression reduces the position-specific sensitivity in the time series regression and reverses the signs in the cross-sectional regression. Including performance interacted with position gives positive coefficients but the hierarchical ordering fails. In particular, the results indicate that vice presidents have the most sensitive compensation. Using relative performance provides evidence that salaries are positively related to raw returns and negatively related to relative returns, while bonuses are unrelated to raw returns and positively related to relative returns.

Jensen & Murphy (1990)

Jensen and Murphy (1990) examine pay for top executives to see if they are compensated in a way that will reduce agency costs by aligning incentives. They find that during the mid-70s to mid-80s, executive pay is not particularly sensitive to performance, and most of the sensitivity comes from stock ownership.

The methodology regresses different measures of compensation change on changes in shareholder wealth to get a sensitivity estimate. They find that a $1000 change in value causes only a $3.25 change in CEO wealth, and $2.50 of this comes from stock ownership. The sensitivity is greater in small firms, $8.05 versus $1.85. Bonuses are generally very stable and do not seem to reflect changes in performance. Real CEO stock holdings and the level of pay have fallen over time, suggesting that political pressures have constrained the ability to offer pay for performance contracts.
The variability of CEO pay has also fallen so that it is no more variable now than general labor, although executives are less likely to receive pay cuts and more likely to receive large raises. Pay seems to be tied to accounting measures rather than market or individual performance. Dismissals do not seem to be an important incentive, mainly because they are rare. Sensitivity does not seem to reflect the manager’s level of stockholdings. Sensitivity has decreased over time.

These results are inconsistent with formal agency models of optimal contracts. Alternative explanations are that CEOs are unimportant inputs in the production process, actions are easily monitored/evaluated, or there are political or social pressures that “cap” compensation. Of these, only the latter is reasonable. Perhaps managerial risk aversion requires even higher compensation for subjecting managers to performance risk. This is hard to rationalize as the sole reason since the amount of wealth at risk is a relatively small portion of total CEO wealth. Highly sensitive contracts may not be feasible since executives cannot credibly commit to paying large amounts in the event of poor performance. It may also be the case that there are non-pecuniary benefits such as power and prestige that do provide the right incentives. However, these factors may incent the manager to be a good citizen rather than maximize shareholder value.

A weakness of the study is that firm value changes may not be good measures of the CEO’s performance. For example, flat performance during a recession may in fact be good. Tests indicate that relative performance is not important, however. There is also an endogeneity problem.

Yermack (1995)

Yermack (1995) tests nine theories of why companies award executives stock options. The main idea being tested is whether firms with high agency costs increase pay for performance sensitivity with stock options. The primary findings support few of the theories. There is evidence that regulated firms are less likely to use options, while firms with noisy accounting earnings or liquidity constraints will use options more.

The analysis uses two possible dependent variables, the option delta times the fraction of ownership or the value of option compensation relative to salary and bonus. The first is a “flow” measure while the latter is a “stock” measure. Measures of option values are based on the Black-Scholes model and include only new awards.
A tobit regression incorporates individual firm effects and accounts for the large number of variables with values of zero. The predictions and results are in Table 5.7.

Sloan (1993)

Sloan (1993) examines the incremental role of accounting figures in determining CEO compensation. The logic follows Fama (1980); accounting earnings do not subject the risk-averse manager to uncontrollable market noise.
Table 5.7: Predictions and Results in Yermack

Theory                      Prediction
Incentive Alignment             –
Horizon Problems                +
Growth Oppty’s                  +
Accounting Noise                +
Agency Costs of Debt/FCF        –
Regulation                      –
Liquidity constraints           +
Tax loss CF                     +
Earnings Management             –

Finding (row alignment not recoverable): – + – +
There will be a greater reliance on earnings when: (i) the firm’s stock returns are highly correlated with market noise, (ii) earnings are highly correlated with firm specific signals in returns, or (iii) earnings are less correlated with market wide noise. Thus, accounting earnings are used as an instrumental variable in a sense. Ideally, pay would be a function of actions, but these are not easily observable. Instead, price can be used as a determinant of compensation, but price is a noisy measure. The weights placed on price and earnings reflect the tradeoff between incentive alignment and risk-sharing.

There are two important variables in the analysis, the ratio of variance in market wide noise to variance of earnings noise and the correlation between these sources of noise. Both variables are interacted with accounting performance and stock performance. When the ratio of noise variances is large, compensation should be based more on the accounting earnings and less on the returns performance. When the sources of noise are positively correlated the firm will base compensation less on accounting measures and more on stock returns.

The results indicate that the variance of noise in returns is less than the variance of noise in earnings. The correlation between market wide noise and earnings noise is close to zero. Sloan finds support for the three hypotheses tested in this paper. First, earnings measures shield executives from market noise. Second, CEO compensation is more sensitive to earnings performance when the returns are noisy relative to earnings. Finally, firms place more emphasis on earnings when the correlation between noise in stock returns and earnings is closer
to negative one. Bizjak, Brickley & Coles (1993) Bizjak, Brickley, and Coles (1993) explain why firms use multi-year compensation contracts and show that it is not always optimal to tie compensation to current performance when there are informtion asymmetries. This contrasts with the rule of maximizing current stock price in a world of perfect markets and homogeneous expectations [Fama & Miller (1972)]. The basic intuition is that when compensation is based only on current performance there are incentives to maximize the current stock price at the expense of long-run performance, either by under- or overinvesting. Supportive evidence shows that high growth firms use longer contracts. There is no relation between either CEO starting age or tenure and growth opportunities. A surprising result is that the sensitivity of salary/bonus and total compensation to stock performance are lower in high growth firms. In the model managers use the observable investment decisions to manipulate the market’s inference about the firm. The incentive to do so is strongest when the manager is likely to leave the firm before the market fully learns the firm’s type. The compensation plan is then structured to balance the emphasis on current versus future stock price. To test the theory empirically the authors use M/B and R&D as proxies for informational asymmetries with control variables for size and regulated industries. The main analysis uses the ratio of salary and bonus incentives to total compensation incentives.5 Additional regressions use these variables in isolation. The results indicate that firms with high information asymmetries pay a lower proportion of compensation in the form of salary and bonus. Large firms, regulated firms, and high growth firms have total compensation and salary/bonus that are less sensitive to changes in shareholder wealth. 
Smith & Watts (1992)
Smith and Watts (1992) test a variety of theories regarding decisions about financing, dividend, and compensation policies. The evidence suggests that contracting theories are more important in explaining cross-sectional variation in these policies than either tax-based or signaling theories.

Footnote 5: These are actually the change in each per $1000 change in shareholder wealth, as in Jensen and Murphy (1990).
Table 5.8: Predictions and Results of Smith & Watts Dependent Variable E/V D/P Comp. Bonus Option
Independent Variable A/V Reg. Size Ret. – – – + + ? (+) – + (–) + + ? (+) – + – – +
Symbols shown are predictions. Actual results that are significantly different are in parentheses.
The study considers four endogenous policy variables: E/V for financing, D/P for dividends, CEO salary for compensation, and frequency of option/bonus plans for incentive compensation. Independent variables include book assets to value for the investment opportunity set, size, accounting return, and a dummy for regulated industries. The data are on the industry level. The results indicate that firms with more growth options have lower leverage, lower dividend yields, higher compensation, and more frequent usage of stock option plans. Regulated firms have higher leverage, higher dividend yields, lower compensation, and less frequent usage of stock option/bonus plans. Finally, larger firms tend to have higher dividend yields and higher levels of executive compensation. These results imply relations among the policy variables as well. There should be a positive relation between leverage and dividend yield and between compensation and the use of incentive plans. There should be negative relations between dividend yield and incentive plans and also between leverage and either compensation or incentive plans.
5.12 Risk Management
There are several ways a firm can manage risk, including diversification, insurance (nonlinear), and hedging (linear). To measure the valuation effect of risk management researchers either use an event study or matched samples.
Stulz (1995)
Stulz (1995) attempts to reconcile the theories and practice of risk management. Survey data indicate that firms typically hedge transactions and do not engage in speculation or arbitrage. At the same time, managers indicate that their views influence the extent of hedging, and many large firms view the treasury as a profit center. Large firms tend to use derivatives more than smaller firms.

Theories predict gains from risk management may come from several sources. In an efficient market with diversification, these gains must arise only from real resource gains such as reducing costs due to financial distress, taxes, wages, or capital acquisition. Since increases in capital are a substitute for risk management, firms with low leverage are generally not expected to benefit much from hedging. In this sense, hedging allows firms to save capital. Since managers dictate the risk management policy it is important to consider their incentives to reduce or increase risk. The chances of bankruptcy also affect risk management. The lowest risk firms can afford to take bets and the highest risk firms are forced to take bets.

Since most of the arguments for risk management focus on left-tail outcomes, methods such as variance reduction are not really appropriate. Value at risk emphasizes the magnitude of the loss that occurs with a given probability, but it is not appropriate either. The path of firm value over time is more important than the distribution at a point in time.

Froot, Scharfstein & Stein (1993)
Froot, Scharfstein, and Stein (1993) develop a theoretical framework describing optimal risk management strategies. The focus is on what and how much hedging should be done as opposed to why or how to implement the program. The optimal amount and type of hedging depends on the nature of a firm’s investment and financing opportunities.
This paper takes the view that the motivation for risk management is to reduce the variability in cashflows since it disrupts investment and financing activities (see footnote 6). When cashflows are variable the amount of external financing and/or investment will also be variable. Holding investment fixed requires changing external financing. If the marginal cost of funds increases in the amount of financing then the investment policy will still be altered. Thus, actions the firm can take to reduce cashflow variability may increase firm value. This is based on the assumption that firms are more efficient at hedging than individuals.

Footnote 6: Other theories include managerial risk aversion, information asymmetries in the labor market, taxes, financial distress costs/additional debt capacity, and underinvestment.

An implication of the model is that high R&D firms are more likely to hedge. These firms may have greater difficulty raising external funds, either because the growth opportunities are not good collateral or because there may be large information asymmetries. Also, the R&D growth options are not likely to be correlated with hedgeable risks. This effect comes from the distinction between collateral value sensitivities and marginal product sensitivities. Here the marginal product is insensitive to hedgeable risk. Therefore, the firm desires more hedging so it can still fully invest in the bad states. If the marginal product were more sensitive there would be a natural hedge, in the sense that when the firm is in a bad state it wants to invest less anyway.

Several conclusions arise from the model.
• Optimal hedging does not always mean full hedging.
• Firms should hedge less when future investment and cashflows are highly correlated and more when collateral and cashflows are correlated.
• Hedging by multinationals is influenced by revenue and expense exposures to exchange rates.
• Nonlinear hedging allows added precision.
• Futures and forwards are different intertemporally.
• Hedging practices of competitors matter to a firm.

May (1995)
May (1995) tests the theory that managerial risk preferences affect the risk management decisions of the firm. The paper focuses on acquisitions, which can be a substitute for other risk management practices. For managers, diversification may be a positive NPV project, even though it may be bad for shareholders.
The main finding is that managers with more personal wealth invested in the firm tend to diversify, despite evidence that diversification typically reduces firm value [Berger and Ofek (1995)]. The CEO’s motives are proxied by his tenure, estimated fraction of wealth in equity, specialization of human capital, and past performance. The relations between these variables and the diversification level sought, industry-adjusted leverage, volatility, and idiosyncratic risk are considered. Diversification level sought is measured as the covariance of returns between bidder and target, firm-specific risk reduction, and implied change in volatility. There is strong evidence that the fraction of wealth in equity is important. CEOs with specific expertise tend to buy related targets. Poor past performers often make diversifying acquisitions. There is weak evidence that seasoned experts also make diversifying acquisitions, perhaps because their human capital becomes too firm-specific.

Tufano (1996)
By focusing on the gold industry Tufano (1996) is able to carefully examine the determinants of risk management. Isolating the gold industry allows a study where there is a common exposure to output price. The wide variety of risk management policies and gold-related derivative instruments used by the industry provides cross-sectional variation. Data collection efforts are aided by the public disclosure of risk management activities. The gold industry should use very little hedging since its assets are mostly tangible and known, investors can hedge on their own relatively easily, and detailed reporting minimizes informational asymmetries. Despite these reasons, 85% of the firms do manage risk.

Table 5.9: Predictions and Results in Tufano
Hypothesis: Distress; Disruption of Invest.; Cost of Ext. Fin.; Tax; Risk Aversion; Other Fin. Policies
Variable: Cash costs; Leverage; Exploration; Acquisitions; Firm value; Reserves; Tax loss CF; Mgr. stock; Mgr. options; Nonmgr. block; Diversification; Cash
Predicted: + + + + – – + + – ? – –
Actual: + | + – – –
To perform the analysis, Tufano calculates a delta percentage (∆%), which is the portfolio delta times the ratio of ounces hedged to expected production. If ∆% = 0 there is no hedge and the firm is long its full production. At ∆% = 1 the firm has a full hedge. A delta percentage less than zero or greater than one indicates a speculative long or short position, respectively. The independent variables in the tobit regression are summarized in Table 5.9. He finds support for management incentives, but little support for firm incentives. When managers own more stock options firms manage risk less, but when managers have more wealth invested the firms manage risk more. Other results show that firms with low cash balances or CFOs with short tenure manage risk more.

Geczy, Minton & Schrand (1996)
Geczy, Minton, and Schrand (1996) try to explain “Why Firms Use Currency Derivatives.” They test the predictions of hedging theories by looking at a subset of the Fortune 500 firms with ex ante foreign exchange exposure. The study also considers how the magnitude of the exposure affects the benefits from risk reduction and the associated expenses.

The results indicate that financing constraints provide incentives for hedging. There is evidence of underinvestment, especially for firms with little financial flexibility. Firms may choose to use foreign-denominated debt as a substitute for direct hedging. The expenses associated with hedging are important. There is no support for speculative positions. Roughly 40% of the firms use currency swaps, forwards, or options. Usage is more common among firms with more growth opportunities or greater financial constraints, consistent with the model of Froot, Scharfstein, and Stein (1993). Larger firms or firms using other derivative instruments are more likely to use currency derivatives, indicating economies of skill and scale. Firms tend to hedge foreign currency with forwards and foreign interest with swaps.
The analysis is based on a logit regression predicting currency derivative use. The categories of factors considered are managerial incentives, bondholders, equityholders, operating characteristics, substitutes, and costs. An effort is made to account for the endogeneity problem related to a firm’s choices of capital structure, executive compensation, and derivatives usage. Consistent with the argument that more foreign exchange exposure increases the benefits to hedging, the authors find that the likelihood of currency derivatives use is positively related to foreign sales, foreign-denominated debt, and foreign pre-tax income. The positive relation between hedging and R&D and the negative relation with the quick ratio support the claim that firms with the highest external finance costs use currency derivatives.
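The delta percentage that Tufano (1996) uses, described earlier in this section, can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the portfolio delta is taken to be signed so that a hedge of the firm's long gold exposure is positive (matching the text's convention that ∆% = 1 is a full hedge), and the numbers are made up.

```python
import math


def delta_percentage(portfolio_delta, ounces_hedged, expected_production):
    """Delta percentage as described in the text: portfolio delta times
    the ratio of ounces hedged to expected production.

    Assumption: portfolio_delta is positive for positions that offset
    the firm's long exposure to gold.
    """
    if expected_production <= 0:
        raise ValueError("expected_production must be positive")
    return portfolio_delta * ounces_hedged / expected_production


def classify_hedge(delta_pct, tol=1e-9):
    """Map a delta percentage to the cases listed in the text."""
    if math.isclose(delta_pct, 0.0, abs_tol=tol):
        return "no hedge"            # firm is long its full production
    if math.isclose(delta_pct, 1.0, abs_tol=tol):
        return "full hedge"
    if delta_pct < 0:
        return "speculative long"    # more than fully exposed to gold
    if delta_pct > 1:
        return "speculative short"   # hedged beyond expected production
    return "partial hedge"


# Hypothetical firm: options with a delta of 0.5 written on 100,000 oz,
# against expected production of 200,000 oz.
pct = delta_percentage(0.5, 100_000, 200_000)
print(pct, classify_hedge(pct))
```

The "partial hedge" label for values strictly between zero and one is an interpolation between the two endpoint cases the text gives, not a category taken from the paper.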
5.13 Internal/External Markets and Banking
The distinction between internal and external capital markets becomes important when there are market frictions. Fama (1985) claims that banks must provide some unique services since they are effectively taxed by reserve requirements, but the organizational form still exists. Possible explanations are an informational advantage, greater capacity to monitor, and a certification/signaling role.

Rajan (1996)
Rajan (1996) presents a model incorporating the endogenous costs and benefits of bank debt. An optimal borrowing structure reduces a bank’s ability to appropriate rents from the borrower without drastically reducing its ability to control. The main result is that an informed bank can prevent a manager from continuing a negative NPV project, but it comes at a cost of reduced managerial effort and value due to the bank’s bargaining power over positive NPV projects. Arm’s length debt has neither the bargaining power nor the monitoring capacity of bank debt, but demands a higher return ex ante to compensate for the negative NPV projects.

In the model an owner-manager needs external financing to pursue a project idea. After making the investment, the manager exerts costly effort which affects the distribution of project returns. The bank has the ability to force discontinuation if the project becomes negative NPV. Since the manager is a residual claimant, he always wants to continue [Jensen and Meckling (1976)]. Note that everyone is risk-neutral in the model.

The structure of the bank loan is important. If the bank requires repayment when the true state is revealed, the bank has the power to hold up the manager unless he has other financing options. This causes the owner to lose some of the surplus from the project and he will no longer exert optimal effort. Alternatively, the bank can require repayment only at completion of the project. Now the bank loses its power to force discontinuation and has
to bribe the manager to stop negative NPV projects. Competition among financiers has ambiguous effects. It reduces the bank’s ability to extract a surplus in the good states, but also reduces its ability to force discontinuation since the manager can borrow from uninformed sources.

Puri (1996)
The purpose of Puri (1996) is to determine whether banks suffered from a conflict of interest when they were allowed to underwrite securities offerings. The Glass-Steagall Act of 1933 prevented banks from underwriting based on the premise that banks had an incentive to underwrite offerings of their own troubled loans. There is a tradeoff between the informational advantage banks have, which should reduce the yield premium, and the conflict of interest, which would raise the premium.

The strategy of the paper is to look at yield premiums of commercial banks versus investment banks. The null hypothesis is that the yield premiums are the same for the two types of banks. The sample includes several hundred offerings between 1927 and 1929, the period between the McFadden Act, which made underwriting legal, and the Depression. The main analysis is a regression of yield premium on control variables and a dummy for commercial banks. The control variables include credit quality, loan amount, syndicate size, firm age, and dummy variables for exchange listing, securitization, and new issue.

The results suggest that commercial banks did not have a conflict of interest. The yield on bank underwritten issues is lower than that on underwritings by investment banks, especially for the informationally sensitive offerings such as new issues, industrials, preferred, and lower-grade. This indicates the informational effects dominate the conflict of interest and is consistent with positive abnormal returns for bank loan announcements.

Shin & Stulz (1996)
A test of whether divisional structures influence investment policy is the focus of Shin and Stulz (1996).
A firm with multiple divisions has several potential costs and benefits of diversification. On the one hand, internal capital markets will provide cheaper access to capital if external markets are imperfect. On the other hand, bureaucracy may hamper efficient investment. The basic evidence is that the investment of small divisions depends heavily
on the cashflow of larger divisions, but the investment of larger divisions does not depend much on the cashflows of other divisions. This suggests that internal capital markets are important, but does not tell us if they are good or bad.

There are three hypotheses under consideration. With bureaucratic rigidity, additional management and inefficient policies and procedures may cause firms to give divisions a “sticky” fraction of the total capital budget. One division’s allocation will be inversely related to the cashflows of other divisions. This inverse relation will be stronger with more divisions, and weaker when investment is not expected to be sensitive to cashflows. Under the hypothesis of efficient internal capital markets firms will shift funds (including dividends) to the source of highest value. In this setting other divisions will benefit when a large division has high cashflows and relatively poor investment opportunities. Finally, the free cashflow hypothesis says that firms may still shift funds to the best use, but dividends will not be paid. The prediction of this theory is that firms will invest more in non-core segments if the core business has high cashflows and poor prospects.

To address these theories the authors examine the link between CF and investment at the division level compared to the entire firm. The link between a division’s investment and the cashflows of the other divisions is also considered. A distinction is made between small and large divisions. The ratio of divisional capital expenditures to lagged divisional assets is regressed on the lagged value of that ratio, divisional sales growth, divisional CF/Assets, and CF/Assets of other segments. These regressions are performed separately for small and large segments, with further subdivision on the number of segments. The entire analysis is repeated for large firms. The results indicate that the investment of all divisions is positively related to each of the independent variables.
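The regression just described can be written out as an equation. The following is only a sketch of one way to express it: the symbols, the timing of the cashflow and asset terms, and the scaling of the other-segment cashflow are assumptions introduced here, not notation taken from the paper. Here I is capital expenditures, A is assets, g is sales growth, CF is cashflow, j indexes the division, and −j denotes the firm's other segments.

```latex
\frac{I_{j,t}}{A_{j,t-1}}
  = \alpha
  + \beta_1 \frac{I_{j,t-1}}{A_{j,t-2}}
  + \beta_2\, g_{j,t}
  + \beta_3 \left(\frac{CF}{A}\right)_{j,t}
  + \beta_4 \left(\frac{CF}{A}\right)_{-j,t}
  + \varepsilon_{j,t}
```

In this notation, β_3 measures the sensitivity of a division's investment to its own cashflow and β_4 the sensitivity to the cashflow of the other segments, which is the coefficient compared across small and large divisions.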
For small divisions, the cashflows of other divisions are fairly important, while this is not the case for larger divisions. With more divisions the importance of other segments increases. For firms where investment is not expected to be sensitive to cashflows (e.g., low leverage or high q; see footnote 7), the sensitivity of a division’s investment to other divisions’ cashflows is weaker. This sensitivity increases with the number of divisions. These results are consistent with the bureaucratic rigidity hypothesis. There is little evidence supportive of the efficient internal capital markets or free cashflow hypotheses; firms with large divisions that have poor prospects but large free cashflow do not seem to direct more funds to small divisions in growing industries.

Footnote 7: Market to Book is actually used as a proxy for Tobin’s q.

Billett, Flannery & Garfinkel (1995)
Billett, Flannery, and Garfinkel (1995) attempt to determine whether the quality of the lender has a valuation impact on the borrower. The lender’s identity might matter if certain lenders have special monitoring abilities, or if the lender’s preferences for certain risk classes signal the borrower’s type.

Announcement of issuance of public securities is generally met with a price decline. Private securities are often associated with a positive price impact. Therefore, public and private financing do not seem to be perfect substitutes. Institutional features may affect this process. Banks have access to private information in the form of deposit accounts. Government regulations require banks to focus on the risk of individual loans rather than the entire portfolio. Therefore, borrowing from a constrained bank may signal a less risky borrower.

The lender’s credit quality may also matter. Borrowers are likely to prefer healthy banks to preserve long-term relationships and minimize search/switching costs. Expertise in monitoring may produce economies to specialization. A high rating will reduce the lender’s cost of capital. A reputational equilibrium may develop where lenders are expected to deliver securities of a certain type.

The analysis performs an event study on a sample of firms with loan announcements in the 1980s. Univariate analysis shows that there is not a difference between the abnormal returns when the lender is a bank versus a non-bank. However, borrowers experience a positive abnormal return when borrowing from a bank with a high credit rating, versus a negative abnormal return from lenders with lower ratings.
Regression results indicate that abnormal returns increase by 20 basis points for each change in the lender’s credit rating after controlling for other factors such as firm size, preannouncement run-ups, and other firm characteristics.

Fazzari, Hubbard & Petersen (1988)
Fazzari, Hubbard, and Petersen (1988) test whether financing constraints affect investment. In perfect markets, financing alternatives are perfect substitutes and the investment and financing decisions are separate. Market imperfections make external markets more expensive. Asymmetric information is the primary friction; others include transactions costs, taxes, agency problems, and financial distress.

The paper explores the empirical support for the q theory, the sales accelerator model, and the neoclassical model of investment. Each of these models predicts that factors other than cash flow drive investment. Under the q theory, firms invest as long as the marginal q is greater than unity. The neoclassical theory is based on the notion that the financial characteristics of a firm do not affect the cost of capital. The sales accelerator model says that sales growth drives investment.

The basic idea behind the empirical tests is to define three classes of firms based on dividend payouts (retained earnings). These groups are proxies for information asymmetry; high payouts mean the firm has the lowest costs to external financing. Investment per dollar of capital is then regressed on financial measures to see if there are differences across groups. The results indicate that cashflow is important in determining investment, and more so for the firms with low dividend payouts. This supports the pecking order theory.
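As a rough illustration of the test design described above, one specification consistent with the text is shown below. It is a sketch under assumptions: the text says only that investment per dollar of capital is regressed on "financial measures," so the particular regressors (a q term and cashflow per dollar of capital), the firm intercept, and all symbols are introduced here for illustration. Here I is investment, K is the capital stock, Q is a measure of q, CF is cashflow, i indexes firms, and c indexes the three dividend-payout classes.

```latex
\left(\frac{I}{K}\right)_{i,t}
  = \alpha_i
  + \beta^{(c)}\, Q_{i,t}
  + \gamma^{(c)} \left(\frac{CF}{K}\right)_{i,t}
  + \varepsilon_{i,t},
\qquad c \in \{\text{low},\ \text{medium},\ \text{high payout}\}
```

Under perfect markets cashflow should not matter once investment opportunities are controlled for, so γ would be zero for every class. The finding summarized above corresponds to γ being positive and largest for the low-payout class.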
5.14 Convertible Debt
Convertible debt can be viewed as an indirect equity issuance: when a firm calls its bonds it is like issuing equity. Under a signaling hypothesis, firms tend to issue equity when their shares are overvalued. Many researchers have argued that in perfect markets, convertible bonds should be called as soon as possible to minimize the value of the liability. Early empirical evidence suggests that corporations wait too long to call and there are negative excess returns at the announcement of the call. Subsequent researchers proposed several reasons why a firm may choose to delay the call. These include managerial compensation schemes based on EPS, the effect of reduced bondholder goodwill on future issuances, a preference for voluntary conversion induced by dividend increases, and suboptimal conversion strategies by the security holders.

Stein (1992)
Stein (1992) develops a theory explaining the use of convertible debt based on the cost of financial distress and the importance of call provisions. Convertibles allow a company to get equity into the capital structure “through the back door,” while mitigating the adverse selection costs of a direct equity issuance. Since a convertible issue is like a combination of debt and equity, the issuance signals better prospects than an equity issuance.

The model is an extension of Myers and Majluf (1984), where there are good, medium, and bad firms that differ in the probability of a high cash flow. The firm knows its type at time zero, while investors get this information at time one. The cashflow is revealed at time two. A good firm is certain to get the high cashflow X_H. Medium firms get X_H with probability p. Bad firms may improve with probability (1 − z) and then have probability p of getting X_H, or deteriorate and get nothing.

A basic version of the model gives firms the choice of equity, long term debt, and convertible debt. When costs of financial distress are sufficiently high (C > I − X_L) there is a separating equilibrium. Good firms choose debt since there are no distress costs and the firm does not have to sell undervalued securities. Medium firms choose convertible debt to reflect the tradeoff between distress costs and issuing undervalued securities. The bad firms choose equity because the distress costs of other securities outweigh the benefits.

There are several forms of empirical support for the model. Firms often state the desire to get equity into the capital structure as a reason for issuing convertible securities. Convertible debt is often (and fairly quickly) converted into equity. Convertible issuers tend to have high informational asymmetries and costs of financial distress as indicated by high R&D/Sales, M/B, D/E, and CF volatility. Finally, the stock price reaction to convertible issues is typically half to a third the negative reaction of equity issuances.
Ofer & Natarajan (1989)
The paper by Ofer and Natarajan (1989) assesses whether the negative share price reaction to a call announcement is due to signaling. There is a decline in performance after the announcement as well as a continued negative CAR over the next five years. Under a signaling framework the announcement of the call will be met with a negative return since investors perceive the call as signaling bad news. For the signal to be effective the firm must perform poorly after the call.

The sample consists of over 100 voluntary calls during the 1970s. There is a potential sample selection bias since the pre-announcement performance
tends to be abnormally high. After the call, what may be normal performance will look poor in comparison. Other papers which correct for this problem do not find evidence of poor post-announcement performance.

The authors use several measures of performance to avoid the causality problem between the call decision and the performance measures. EBIT will not be affected by the conversion. EBT is affected through the reduction in interest. EPS is affected by both the interest and the increase in number of shares. Finally, AEBT is EBT less the interest that would have been paid. Three models of normal performance are used. The first assumes the performance is stationary through time. The second and third models express expected performance as a function of average market- and industry-wide performance.

In all cases the results indicate that these firms have unexpectedly poor performance. The call announcement is associated with a negative abnormal return, then followed by negative CARs over the next five years. These results are consistent with the information signaling hypothesis and the predictions of Myers and Majluf (1984).

Dunn & Eades (1989)
Dunn and Eades (1989) attempt to explain the observation that firms wait too long to call preferred stock by focusing on the assumption that investors follow perfect-market strategies. If enough investors deviate from the perfect-market strategy then it may be optimal for the firm to delay the call. If the dividend yield on the callable security is lower than on the common stock then managers can take advantage of the slow conversion by passive investors. The optimal call policy for the firm is to force conversion by calling as soon as the conversion value exceeds the call price, but before the issue enters the voluntary conversion region (VCR). The VCR is the first ex-date where the dividend on conversion is greater than the preferred dividend and conversion premium.
The study uses convertible preferred stock to avoid complications related to interest tax deductibility. Consistent with the passive investor theory:
• Many investors do not convert in the VCR
• Convertible preferreds sell below conversion values
• Firms are generally not able to increase shareholder wealth by calling
• Passive investors would typically realize incremental returns by converting
The authors define the dividend ratio (DR) as the total conversion dividends
relative to the total preferred dividends in a year. The price ratio (PR) is the average ratio of preferred price to conversion value of equity. The share ratio (SR) is the fraction of preferred shares remaining at the end of the year after conversion. When DR < 1 then PR > 1, indicating that the preferred sells at a premium due to the conversion option and dividend advantage. When DR > 1 then PR = 1 since there is no conversion premium. The SR drops to around 80% prior to entering the VCR, drops to around 50% in the next year, then declines to roughly 10% ten years after entering the VCR.

Consistent with the theory, callable survivors have higher φC/Call, lower SR, higher DR, and lower PR than the called sample. The called sample also has a higher proportion of issues in the VCR. Regression results show that before entering the VCR, conversions increase when preferred is selling below its conversion value. After entering the conversion region, investors are increasingly motivated to convert as the dividend advantage of common stock increases. Using institutional ownership as a proxy for active investors, there is some weak evidence that institutional investors reduce their holdings more than other investors.

Asquith & Mullins (1991)
Asquith and Mullins (1991) explain why companies do not call convertible debt when the conversion value exceeds the call price, as predicted by many theories. There are three primary criteria used to explain this behavior. The first, and most obvious, is simply that the issues are still call-protected. Second, the firms may want the conversion value to be somewhat higher than the call price to provide protection from a price decline during the call notice period. Finally, the most powerful explanation is that there may be cashflow advantages to the firm from not calling when the interest after corporate taxes is less than the dividends.
An analysis of convertible bonds with conversion values in excess of par indicates that 89% fall into one of the above categories. 21 of the remaining 22 are close to or subsequently meet the requirements for one of these groups. Voluntary conversion is more likely with higher conversion value or higher dividends relative to after-tax interest. An increase in conversion value decreases the option value. Investors voluntarily convert when investors get more cash in dividends, a time when firms have an incentive not to call. This is supported by the data since less than 20% of the issues remain when
converted dividends exceed the interest. Although the investor’s problem is the inverse of the firm’s, the decisions are not symmetric because of taxes. Therefore there are bonds which a firm will not call and investors do not convert.

Asquith (1995)
Asquith (1995) corrects prior studies by showing that, when measured properly, there is no call delay. Prior studies draw the conclusion that conversion value in excess of call value indicates a delay from the optimal time to call. A number of these bonds are still call-protected. Many of those that are not protected have the after-tax yield below the dividend, providing a cashflow incentive not to force conversion. Finally, delayed conversion bonds often have relatively low premia or volatile cashflows, providing a price protection justification for the delay. These motivations are discussed in Asquith and Mullins (1991). This paper adds an analysis of the delay between when a bond is callable and when it is called.

The paper finds that those bonds that are called have fewer “live” days. Bonds with relatively high conversion prices and those with D < I(1 − τ) are called more quickly. A puzzle is that there are several bonds with D > I(1 − τ) that are called. The general conclusion is that most bonds are called as soon as possible unless there are cashflow advantages to delaying. The median call delay for all bonds is four months, but less than one month if a price cushion is considered.

Asquith argues that call premiums are not a useful method of detecting whether bonds are called late. Overall, the average call premium is 50%. The average call premium drops to 25% after considering factors such as cashflow motivated delays, sudden stock price increases, and large premiums while call protected.
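The three explanations for an apparent call delay discussed by Asquith and Mullins (1991) and Asquith (1995) can be arranged as a simple screening rule. The sketch below is illustrative only: the class, field, and function names are hypothetical, the order of the checks is a choice made here, and the 20% price cushion is an assumed placeholder rather than a figure from either paper.

```python
from dataclasses import dataclass


@dataclass
class ConvertibleBond:
    conversion_value: float     # market value of shares received on conversion
    call_price: float
    annual_interest: float      # I, pre-tax interest paid on the bond
    converted_dividends: float  # D, dividends on the shares the bond converts into
    tax_rate: float             # tau, corporate tax rate
    call_protected: bool


def reason_to_delay_call(bond, cushion=0.20):
    """Return a reason the firm might not call, or None if none applies.

    Assumption: `cushion` is the margin by which conversion value should
    exceed the call price to guard against a price decline during the
    call notice period; 0.20 is a made-up default.
    """
    if bond.conversion_value <= bond.call_price:
        return "conversion value does not exceed call price"
    if bond.call_protected:
        return "call protected"
    if bond.conversion_value < bond.call_price * (1 + cushion):
        return "price cushion"
    # Cashflow advantage to not calling: D > I(1 - tau)
    if bond.converted_dividends > bond.annual_interest * (1 - bond.tax_rate):
        return "cashflow advantage"
    return None


# Hypothetical bond: I(1 - tau) = 80 * 0.66 = 52.8
bond = ConvertibleBond(1500.0, 1050.0, 80.0, 60.0, 0.34, False)
print(reason_to_delay_call(bond))
```

A result of None corresponds to the case in the text where a bond is expected to be called as soon as possible.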
5.15 Imperfections and Demand
In perfect markets demand curves should be flat but market imperfections may cause downward sloping demand curves. Many important propositions in finance are based on the assumption that investors can buy or sell stock without changing the price. Observed price reactions indicate prices are sensitive to volumes. Large block purchases generally result in price increases, while sales cause prices to fall. With equity issuance there are negative price
reactions, potentially due to agency costs of free cashflow [Jensen (1986)], asymmetric information [Myers and Majluf (1984)], and signaling [Miller and Rock (1985)]. In takeovers bidder prices typically fall while targets receive a premium. Convertible debt and the call announcements are associated with negative market reactions. It is not clear if these reactions are driven by signaling, liquidity, or downward sloping demand curves.

Shleifer (1986)

Shleifer (1986) provides evidence that demand curves for stocks do slope down. He uses inclusion in the S&P 500 as a sample since this event increases demand for the stock without contaminating information effects. Earlier studies had examined the price effects of large block trades but these events may be based on information. A possible certification role of index membership is refuted since the returns are unrelated to bond ratings. The liquidity hypothesis is rejected by finding no difference in the returns of Fortune 500 firms and other firms.

There is no evidence that the market is able to predict inclusion in the index. Before daily notification of the inclusion there is no abnormal return on the event day. Since 1976 there has been a daily notification service of changes in the index. In this period inclusion in the index is associated with a positive abnormal return of about 2.8%. This return lasts for several weeks and seems to be related to buying by index funds.

Other evidence supports the downward sloping demand curve hypothesis as well. The price reaction to large block trades typically only lasts a few hours. Firms with multiple classes of stock that issue more of one class generally experience a price drop only for that class of stock [Loderer, Cooney, and VanDrunen (1991)]. A downward sloping demand curve is also consistent with the January effect.

Shleifer & Vishny (1992)

Shleifer and Vishny (1992) relate the costs of asset sales to leverage in a general equilibrium setting.
When a firm is in financial distress, the most ideal purchasers of the assets are likely to be in financial distress themselves. This liquidity cost is recognized ex ante as a cost of leverage. The main result is that more liquid assets are able to support more debt. This is broadly consistent with Myers (1977). The intuition behind the model is that assets are often specialized, making
them most valuable to firms within the industry. When industry shocks send a firm into financial distress its competitors will also be affected. As a result, there is an industry debt capacity and the leverage of one firm will depend on the leverage of its peers. Firms outside the industry may have an interest in the assets but are likely to pay less. Outsiders fear overpaying since they lack the expertise to properly value the assets, they may lack the knowledge or skills to fully utilize the assets, and they face agency costs in hiring experts to help them.

There are several empirical implications of the model. Liquid assets should be financed with more debt. Cyclical and growth oriented assets are likely to have lower debt financing. Ceteris paribus, smaller firms should be able to support more debt since they can more easily be purchased. Conglomerates should also be able to use more debt since the divisions can cross-subsidize each other. High markets are likely to be liquid markets.

The takeover wave of the 1980s is consistent with this theory. Corporate cashflows were large, as was the number of potential buyers. Antitrust enforcement was relaxed, allowing more intra-industry acquisitions. This increased liquidity and the rise of the junk bond market reinforced each other.

Merton (1987)

Merton (1987) is an asset pricing model which relaxes the assumption of homogeneous information. Although the model is cast as one with imperfect information, it can be interpreted as a model of incomplete markets. Investors are unable to fully diversify so they demand a premium for bearing this undiversifiable unsystematic risk.

In this one period model risk-averse investors know about a subset of the securities in $n$ risky firms. There is also a riskless asset and another asset that combines the riskless security with a forward contract. The market is absent frictions from taxes, transactions costs, and restrictions on borrowing.
If all investors had complete information sets the model reduces to the standard Sharpe-Lintner (SL) CAPM; otherwise the market portfolio is not mean-variance efficient. Information costs come in the form of gathering and processing data, transmitting information, and most importantly, making investors aware of the firm. The return generating process is

$$\tilde R_k = \bar R_k + b_k \tilde Y + \sigma_k \tilde\varepsilon_k.$$
An investor is informed about asset $k$ if he knows $\{\bar R_k, b_k, \sigma_k\}$. All informed investors have conditionally homogeneous beliefs. This structure is similar to the single asset model of Grossman and Stiglitz (1980), but here there is no gaming between the informed and uninformed because investors only invest in securities about which they are informed. The shadow cost of not knowing about an asset is the same for all uninformed investors and is equal to the expected excess return on the asset. The equilibrium expected return is

$$\bar R_k = R + b_k b \delta + \delta x_k \sigma_k^2 / q_k.$$

This equation shows the expected return decreases when the investor base increases.

The model makes several predictions. A large common-factor exposure ($b_k$), large size ($x_k$), or large variance ($\sigma_k^2$) creates high expected returns. When the firm is well-known or has a large investor base ($q_k$) the expected return is smaller. This may give rise to a size effect. These effects can give rise to downward-sloping demand curves. Expansion of the firm’s investor base and increases in investment will tend to coincide, giving a motivation for an underwritten offer instead of a rights offer. Managers have an incentive to expand the investor base, especially for relatively unknown firms and those with large firm-specific variances. This can explain why firms advertise their stock and invest in generating interest in the firm by the financial press. The model is also consistent with IPO waves in general and concentration within an industry.

Kadlec & McConnell (1994)

Kadlec and McConnell (1994) use exchange listing to test the predictions of the Merton (1987) model of investor recognition and the Amihud & Mendelson (1986) model of liquidity factors. In the former model expected returns decrease as the size of the investor base grows. In the latter, expected returns decrease with a reduction in the relative bid-ask spread.
If the expected return decreases then the market value should increase and abnormal returns should be positive. During the 1980s, announcement of NYSE listing results in an abnormal return of 5 to 6%. The listing is also associated with a 19% increase in the number of shareholders, a 27% increase in institutional ownership, a 5% reduction in absolute bid-ask spreads, and a 7% reduction in relative spreads.
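The investor-recognition prediction tested here can be illustrated with Merton's (1987) expected return relation, $\bar R_k = R + b_k b\delta + \delta x_k\sigma_k^2/q_k$. The sketch below is not taken from either paper: the function name and every parameter value are hypothetical, $b$ and $\delta$ are treated as given market-wide constants, and the only point is that a larger investor base $q_k$ lowers the expected return when everything else is held fixed.

```python
def merton_expected_return(R, b_k, b, delta, x_k, sigma_k2, q_k):
    """Equilibrium expected return R + b_k*b*delta + delta*x_k*sigma_k^2/q_k."""
    if q_k <= 0.0:
        raise ValueError("the investor-base measure q_k must be positive")
    return R + b_k * b * delta + delta * x_k * sigma_k2 / q_k


# Hypothetical firm and market parameters (not estimates from any paper).
params = dict(R=1.03, b_k=0.2, b=0.2, delta=2.0, x_k=0.001, sigma_k2=0.09)

q_before = 0.10             # assumed investor base before listing
q_after = q_before * 1.19   # a 19% larger investor base

ret_before = merton_expected_return(q_k=q_before, **params)
ret_after = merton_expected_return(q_k=q_after, **params)

print(f"expected return before: {ret_before:.6f}")
print(f"expected return after:  {ret_after:.6f}")
print(f"change:                 {ret_after - ret_before:.6f}")  # negative
```

Only the last term depends on $q_k$, so the drop in expected return is larger for firms with high firm-specific variance or a small initial investor base.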
The results are consistent with both models. The proxy for Merton’s shadow cost of incomplete information is the inverse of the change in investor base scaled by the level of firm-specific risk and market value. Controlling for the change in bid-ask spread, an increase in investor base results in a positive abnormal return. Controlling for change in investor base, a decrease in the spread is associated with higher abnormal returns.

Loderer, Cooney & VanDrunen (1991)

Loderer, Cooney, and VanDrunen (1991) isolate and identify the potential influence of price elasticity on demand using the price discount from SEOs by regulated firms. Regulated firms are used because they are more likely to have preferred stock and less likely to have information asymmetries. If the stock issuance announcement contains negative information there should be a negative reaction for preferred stock as well. The evidence supports the incomplete markets theory of Merton (1987), but is inconclusive with respect to theories of liquidity or heterogeneous beliefs.

To estimate the determinants of elasticity, INVELAS (see footnote 8) is regressed on variance (–), size (–), investor base/liquidity (+), and proxies for information effects (+). To capture information effects the authors consider $\Delta E[EPS]$, $\Delta EPS$, $\Delta ROE$, and the price change of nonconvertible preferred stock at the announcement. The results are significant and consistent with predictions for all variables except liquidity and information, which are insignificant. These results are robust to a number of different specifications and proxy variables. A potential caveat is that the predictability of issuance by regulated firms may make it difficult to detect information effects.
5.16 Financial Innovation
When there are market imperfections there may be structures of claims that have special value. Just as the prior section dealt with imperfections and asset demands, this section addresses the effects on the supply of securities. Topics covered here include optimal financial contracts, the incentives to innovate, and the existence of clienteles.

Footnote 8. This is the inverse of elasticity. The inverse introduces a nonlinearity in the model that may result in mis-specification.
Zender (1991)

Zender (1991) develops a model of the optimal financing contract that incorporates both cashflow and control allocations. Most existing theories focus only on cashflows. The optimal financial instruments completely resolve incentive problems induced by asymmetric information. In the setting of the paper, standard debt and equity contracts are optimal. Bankruptcy broadens the investment opportunity set and facilitates cooperation between the parties.

In the model there are three agents: an entrepreneur, an active owner, and a passive owner. The owners are risk-neutral and have limited capital. At $t = 0$ contracts are designed and sold and the initial investment $I_0$ is made. At $t = 1$ information about $CF_3$ is made public, the firm receives $CF_1$, and control for $t = 2$ is assigned. At $t = 2$ the firm has an investment opportunity which the controlling owner knows, while the public knows only its distribution. The opportunity requires an investment $I$ that is unobservable to outsiders. At $t = 3$ the investment payoff is realized and the firm is liquidated.

There is disagreement among the agents about investment/dividend policy due to the passive investor’s inability to observe investment expenditures. The agents realize up front that risk-shifting may occur and they mitigate it by inducing a state-contingent control change when an observable signal is realized. Cashflows to debt must be fixed in order to provide the equityholder incentives to make efficient investments. This can explain the use of debt before tax shields.

Tufano (1989)

Tufano (1989) examines “innovative” investment banks and the benefits from innovation. He finds that innovators generally do not charge monopoly prices (underwriting spread). Instead, they charge lower long-run prices and gain market share. One interpretation is that innovation can reduce costs of trading, underwriting, and marketing.
To identify the importance of price as a source of first-mover advantage, underwriting spreads are regressed on measures of competitiveness and underwriter identity and control variables for offering characteristics. A dummy variable for the monopoly period is insignificant for all offers and negative for imitated products. Permanent price effects as measured by a
pioneer dummy are reliably negative. The long-run quantity effects appear to be an important source of first-mover advantage. Pioneers capture market share nearly 2.5 times as large as imitators. Temporary quantity effects due to periods of monopoly are not important since the number of deals is small and imitators are quick to follow.
Kim & Stulz (1988)

Kim and Stulz (1988) directly test the clientele hypothesis, which says that firm value can be increased by seeking funding from groups with unique demands. The evidence is consistent with this hypothesis.

The authors focus on Eurobonds from U.S. firms that also issue domestic debt. Eurobonds are generally bearer bonds allowing the holder to escape taxes. There are some questions over the enforceability of the bond indenture, so reputation replaces restrictive covenants. Foreign investors may desire these securities because they offer diversification yet have smaller purchasing power and political risks. This market is characterized by larger underwriting spreads. If the supply of Eurobonds is not perfectly elastic then excess demand can create profitable financing opportunities since investors will accept lower yields. The supply of these securities is somewhat constrained because of the high issuance costs, low risk requirement, and reputational capital required.

The results indicate there are positive abnormal returns at the announcement of Eurobond issues. A comparison sample of domestic debt issues shows no significant announcement effect, as in Mikkelson and Partch (1986). The positive abnormal returns occur mostly during the 1979–1982 period of bought-deal underwriting when yield spreads were large. This type of arrangement reduced the time it takes to issue Eurobonds. The positive abnormal returns diminished in subsequent years when shelf registration increased the attractiveness of domestic issues. Abnormal returns were indistinguishable from zero when tax laws ended the withholding tax for foreign investors in domestic bonds.

The clientele hypothesis is tested by regressing abnormal returns on the size of the financing bargain. They find a slope coefficient different from zero but not different from one, consistent with the clientele hypothesis.
Jung, Kim & Stulz (1996)

The paper by Jung, Kim, and Stulz (1996) finds that some firms appear to issue equity for the benefit of managers rather than shareholders. See Section 5.9.6 for a more complete discussion of this paper.

McConnell & Schwartz (1992)

McConnell and Schwartz (1992) describe the process leading up to the development of the Liquid Yield Option Note (LYON) by Merrill Lynch in 1985. This is a zero-coupon, callable, convertible, putable instrument. This instrument is designed to reduce the transactions costs associated with a strategy of investing in options and the money market. These investors desire a risky investment paying interest but preserving the principal, much like portfolio insurance. The value of the security is relatively insensitive to the risk of the company, reducing the cost of information asymmetries. There is a self-selection by firms since only those with the most confidence in their prospects will issue. When pricing the instrument it is important to consider the interaction/covariance between the various components.
Chapter 6 Market Microstructure

6.1 Introduction
Information economics deals with incorporating information into asset prices. Market microstructure is the study of the process and outcomes of exchanging assets under explicit trading rules. The focus is often on the interaction between the mechanics of the trading process and its outcomes, with specific emphasis on how actual markets and intermediaries behave.

Randomness is an important part of any of these models. The source of the randomness has implications for the characteristics of the model. In all cases there is uncertainty about future outcomes or cashflows. Informed agents have imperfect information about the future value of the asset. This information may be the same for all informed agents, or they may each have diverse signals. Some models include uninformed agents whose demands depend on price. Noise trading is an additional source of uncertainty that introduces uncertainty about the net demands for the asset and prevents fully revealing prices.

A major difference in the models is whether trades are processed in a batch or sequentially. The latter allows dynamics in the price process and facilitates analysis of the bid-ask spread. The literature is fragmented in its view of the risk preferences of specialists.

There are several important idiosyncrasies in early papers that much of the subsequent work tries to solve. The first is a paradox where agents ignore their private information when prices are fully revealing. If this is true, then how does the private information get into prices in the first place?
A second paradox arises when private information is costly and prices are fully revealing. If so, then there is no incentive for collection of private information, and this private information will then never become impounded in the prices. Finally, there is the schizophrenia result where rational agents in a competitive market act as price takers, ignoring the impact their trades will have on the price. The first two problems are solved by introducing noise trading. Allowing imperfect competition solves the last problem.
6.2 The Value of Information
Hirshleifer (1971)

Hirshleifer (1971) analyzes the private and social value of private information in a context of uncertain personal productivity. A distinction is made between foreknowledge, knowing something in advance of its occurrence, and discovery, recognition of something (that may have already occurred) which is not readily observable.

Hirshleifer argues that there is no social value to foreknowledge without production. This is because information is valuable only if it can affect actions. Under the assumptions in the paper, agents have the same endowments, preferences, and beliefs so there is no incentive to trade. If the informed agent could speculate the information would be privately valuable.

In a production economy, foreknowledge is both privately and socially valuable. This is because production can be shifted to the optimal channels based on this information. The informed agent can sell his information so that the economy can fully use it in redirecting productive activities. This has implications for the timing of information releases. Announcements at regularly scheduled intervals allow risk-averse agents to insure before the news announcement and so get out of the way. Random releases of news as it occurs allow more efficient reallocation of production, but expose the agents to distributive risk. The same general results obtain with discovery information.

Marshall (1974)

Marshall (1974) shows that information can be socially valuable even in a pure exchange economy if agents have heterogeneous priors. With homogeneous beliefs, private information has no social value if it can be hedged and
it reduces value if it can not be hedged. In a production economy information is socially valuable with sufficient hedges. Marshall says that there is an overincentive to produce private information.
6.3 Single Period REE
Prices reflect traders’ information in a securities market. In a Rational Expectations Equilibrium (REE) traders with heterogeneous information attempt to infer the information of others from the prices, and then use this information to revise their beliefs. There are several problems with the REE concept. Unless noise is added, prices are typically fully revealing [Grossman (1976)]. Fully revealing prices preclude speculative trading on the basis of heterogeneous beliefs, giving the “no trade” result of Tirole (see footnote 1). If traders are allowed to condition on trades as well as prices, then these data are sufficient statistics for all information and there is no advantage to being informed. Finally, without restricting the distribution of information, there is no trading mechanism that could implement an REE.

Milgrom & Stokey (1982)

Milgrom and Stokey (1982) is a base case for the information content of trades. The model imposes very restrictive assumptions such as complete markets and concordant beliefs. Agents are risk-averse. Under these conditions, a “no trade” result obtains. Once ex ante trading occurs to a Pareto optimal level, no future trading will take place although prices may change. This is because anyone willing to trade must have private information. Other agents will realize this and will be unwilling to trade since they all interpret information in the same way.
Footnote 1. The no trade result of Milgrom and Stokey (1982) will obtain with homogeneous beliefs and a Pareto optimal allocation.
Grossman (1976)

The Grossman (1976) paper deals with the price system as an aggregator of diverse information. If private signals are identically distributed, then the price reveals the average of all agents’ information and private information is redundant given the price. The REE is identical to a Walrasian equilibrium in an artificial economy where agents share their information before trading. With complete markets, equilibrium allocations are ex post Pareto efficient.

In this model, agents have CARA utility so there are no wealth effects, but in a REE there are information effects. A price change affects the desirability of an asset. The model specifies that informed trader $i$ knows $y_i = p_1 + \varepsilon_i$. The resulting price is $p_0(y_1, \ldots, y_N)$. Prices reflect each agent’s private information but do not depend on preferences. This results in a paradox: individuals ignore their own information in favor of the aggregated information, but if they do ignore their private information, how does it get into prices?

The result that prices perfectly aggregate information is not robust to the addition of noise, but another paradox remains. If markets are “perfect” and information collection is costly, then there is no incentive to collect information. The agents in this model are “schizophrenic” in that their actions affect price but they take price as given in determining their demand.

Grossman & Stiglitz (1980)

Grossman and Stiglitz (1980) say that informationally efficient markets cannot exist. If private information is costly, but has no value, then there is no incentive to collect it. This paper differs from Grossman (1976) in that it is a model of asymmetric information rather than diverse information. It endogenously derives the allocation of pretrading information, whereas most other papers take it as exogenous.

The model is based on perfect competition, one-shot trading, and a Walrasian auctioneer. The return on the risky asset is $u = \theta + \varepsilon$.
An agent can pay $c$ to observe $\theta$. Informed agents receive the same signal and all agents have negative exponential utility with risk aversion parameter $a$.
Table 6.1: Summary of Key Models

Paper                           MM    Inf.   Uninf.(a)  Noise(b)  Comments
Milgrom and Stokey (1982)             MAC    —          No
Grossman (1976)                       MAC    —          No        p_i = P + ε_i
Grossman and Stiglitz (1980)          MAC    MAC        Yes       r = θ + ε
Hellwig (1980)                        MAC    —          Yes       diverse info
Diamond and Verrecchia (1981)         MAC    —          Yes       diverse info, noise in endowments
Admati (1985)                         MAC    —          Yes       multiple assets
Kyle (1989)                           MAU    MAU        Yes       i_n = v + e_n
Kyle (1985)                     SNC   SNU    —          Yes       dynamic model
Admati and Pfleiderer (1988)    SNC   MNC    M          Yes
Admati and Pfleiderer (1989)    MN    MNU    M          Yes
Foster and Viswanathan (1990)   SC    SU     C          Yes
Slezak (1994)                         MAC    MAC        Yes       multi-period generalization of GS
Amihud and Mendelson (1980)     SNU   M      —          Yes       spread = cost, is MM C?
Glosten and Milgrom (1985)      NC    MN     —          Yes
Glosten (1989)                  SNU   MA     —          Yes       all traders have liq. and info.
Rock (1989)                     SA    MA     MN                   Inf. = Mkt. orders, Uninf. = limit

Codes in Table: S: Single, M: Multiple; A: Risk-averse, N: Risk-neutral; C: Competitive, U: Uncompetitive.
(a) Uninformed traders whose demands depend on price.
(b) Noise generally refers to liquidity traders, whose demands do not depend on price.
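The informed and uninformed have demands

$$X_I(\cdot) = \frac{\theta - Rp}{a\sigma_\varepsilon^2} \qquad \text{and} \qquad X_U(\cdot) = \frac{E[\tilde u \mid \tilde P(\cdot) = p] - Rp}{a\,\mathrm{var}(\tilde u \mid \tilde P = p)}.$$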
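Both demands have the same CARA-normal form: expected payoff in excess of the financed price, divided by risk aversion times conditional variance. The sketch below illustrates that form only. The function name and all numbers are hypothetical, and the uninformed trader's conditional mean and variance are simply supplied as inputs rather than derived from an equilibrium price function.

```python
def cara_normal_demand(expected_payoff, payoff_variance, price,
                       gross_riskless_return, risk_aversion):
    """Risky-asset demand (E[u | info] - R p) / (a var(u | info))."""
    if payoff_variance <= 0.0 or risk_aversion <= 0.0:
        raise ValueError("variance and risk aversion must be positive")
    return (expected_payoff - gross_riskless_return * price) / (
        risk_aversion * payoff_variance
    )


# Hypothetical inputs: price p = 1.00, gross riskless return R = 1.02, a = 2.
# The informed trader observes theta = 1.10 and faces only the residual variance.
x_informed = cara_normal_demand(1.10, 0.04, 1.00, 1.02, 2.0)

# The uninformed trader conditions on price only, so the assumed conditional
# mean is less extreme and the assumed conditional variance is larger.
x_uninformed = cara_normal_demand(1.06, 0.09, 1.00, 1.02, 2.0)

print(x_informed, x_uninformed)
```

With these assumed inputs the informed trader takes the larger position, both because the expected excess payoff is higher and because the remaining uncertainty is smaller.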
For markets to clear $\lambda X_I + (1-\lambda) X_U = x$.

In Grossman (1976) there is a paradox since agents ignore their own information, yet prices perfectly aggregate this information. A solution is to introduce noise in the form of liquidity traders. Now prices are not fully revealing, private information still has value, and trading based on common beliefs is possible. With dynamic trading the market maker can break even on average, not on every trade [see Glosten (1989)]. If competition is imperfect, the equilibrium price reveals less information, although the price is determined as if a nontrading auctioneer aggregated demand curves.

If a market maker replaces the auctioneer, one needs to ask what services the market maker is providing. Many papers take the position that the market maker is an information processor. The informational component of the spread is proportional to the probability of trading with an informed agent and also proportional to the informed trader’s expected profit from holding the asset. Spreads will be larger for larger quantities.

Hellwig (1980)

Hellwig (1980) attempts to avoid the schizophrenic agents in Grossman (1976) by enlarging the economy. The model takes the limit of the incorrect economy, rather than fixing the problem. In other words, this solution is essentially “at the limit” rather than “in the limit,” leaving open the question of how an economy becomes large in the first place. The paper is still important in that it shows the schizophrenia problem may be small when the economy is large.

The model is basically an extension of Grossman (1976), but with the addition of noise in the supply of the risky asset. The amount of information also grows with the size of the economy [Kyle (1989) holds it fixed]. In a finite-agent economy when the noise is small, the price becomes fully revealing, as in Grossman. Upon enlarging the economy, the prices do not fully reflect the information of the informed agents.
Individuals find their own information to
be incrementally informative to the price alone. The strength of an agent’s reaction to his signal is inversely related to his risk aversion and the noisiness of his signal.

Diamond & Verrecchia (1981)

The Diamond and Verrecchia (1981) model of a competitive market yields prices which partially aggregate diverse information to form prices which are not fully revealing. Prices deviate from the “efficient” level by a random amount. Noise is explicitly modeled as random endowments in the risky asset. If individual endowments are iid, per capita supply is constant in the limit and the model approaches the Grossman (1976) fully revealing model. If the variance of individual endowments grows with the population, the limit is Hellwig (1980).

Admati (1985)

Admati (1985) is a multisecurity version of Hellwig (1980). Investment decisions are based on MV considerations, but each agent in effect uses a different model since they condition on different information. These conditional models do not naturally aggregate to imply similar unconditional models. Therefore, the market is generally not MV efficient for any particular information set, including all public information. Uncertainty about the supply of one asset may prevent the prices of other assets from being fully revealing. This may represent a solution to the Grossman and Stiglitz (1980) paradox.

The correlations among the assets can result in a number of strange results. Price may be decreasing in the profitability of an asset or increasing in its supply. The predicted payoff of an asset may be decreasing in price. Finally, assets may increase in price with greater demand.

Kyle (1989)

The Kyle (1989) paper solves the schizophrenia problem by allowing imperfect competition. The model uses noise traders, uninformed traders, and multiple informed speculators in a static model. A Walrasian auctioneer accepts limit orders. The informed speculators receive independent, normally distributed noisy signals of the asset value.
Traders have negative exponential (CARA) utility.
The value of the asset is given by $\tilde v$ with variance $\tau_v^{-1}$. Noise traders have random demands $\tilde z$ with variance $\sigma_z^2$. There are $N$ informed agents with information $\tilde i_n = \tilde v + \tilde e_n$ where $\mathrm{var}(\tilde e_n) = \tau_e^{-1}$. There is a symmetric linear equilibrium with informed demands $X_n(p, i_n) = \mu_I + \beta i_n - \gamma_I p$ and uninformed demands $X_m(p) = \mu_U - \gamma_U p$.

If all information could be combined the precision of the forecast would be

$$\tau_F = \mathrm{var}^{-1}(\tilde v \mid \tilde i_1, \ldots, \tilde i_N) = \tau_v + N\tau_e.$$

The precisions for the informed and uninformed are

$$\tau_I = \mathrm{var}^{-1}(\tilde v \mid \tilde p, \tilde i_n) = \tau_v + \tau_e + \psi_I (N-1)\tau_e \qquad \text{and} \qquad \tau_U = \mathrm{var}^{-1}(\tilde v \mid \tilde p) = \tau_v + \psi_U N \tau_e.$$

The terms $\psi_I$ and $\psi_U$ represent the fraction of information available to the type of agent. When $\psi = 0$ prices are uninformative and when $\psi = 1$ prices are fully revealing. Expressions for these terms are

$$\psi_U = \frac{N\beta^2}{N\beta^2 + \sigma_z^2\tau_e} \qquad \text{and} \qquad \psi_I = \frac{(N-1)\beta^2}{(N-1)\beta^2 + \sigma_z^2\tau_e}.$$
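A short numerical sketch shows how the informativeness measures and precisions fit together. It takes the equilibrium demand coefficient $\beta$ as a given input, which is an assumption for illustration (in the model $\beta$ is determined in equilibrium), takes the full-information precision to be $\tau_v + N\tau_e$, and uses a hypothetical function name and hypothetical parameter values throughout.

```python
def kyle1989_precisions(N, beta, sigma_z2, tau_v, tau_e):
    """Return psi_U, psi_I and the precisions tau_U, tau_I, tau_F."""
    noise = sigma_z2 * tau_e
    psi_U = N * beta**2 / (N * beta**2 + noise)
    psi_I = (N - 1) * beta**2 / ((N - 1) * beta**2 + noise)
    return {
        "psi_U": psi_U,
        "psi_I": psi_I,
        "tau_U": tau_v + psi_U * N * tau_e,                # price only
        "tau_I": tau_v + tau_e + psi_I * (N - 1) * tau_e,  # price and own signal
        "tau_F": tau_v + N * tau_e,                        # all signals pooled
    }


# Hypothetical parameters: five informed traders, unit precisions and noise.
result = kyle1989_precisions(N=5, beta=1.0, sigma_z2=1.0, tau_v=1.0, tau_e=1.0)
for name, value in result.items():
    print(f"{name}: {value:.4f}")

# More noise trading makes prices less informative for both types.
noisier = kyle1989_precisions(N=5, beta=1.0, sigma_z2=4.0, tau_v=1.0, tau_e=1.0)
print(noisier["psi_U"] < result["psi_U"], noisier["psi_I"] < result["psi_I"])
```

In this example the ordering is $\tau_U < \tau_I < \tau_F$: the price reveals part of the pooled information, an informed trader adds his own signal to it, and neither type reaches the full-information precision unless $\psi = 1$.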
The results of the model are prices that are less revealing than in the perfect competition case. The uninformed break even on average and the informed profit at the expense of the noise traders. An increase in the number of uninformed or a decrease in their risk aversion $\rho_U$ increases the information effect. As $M \to \infty$, $E[\tilde v \mid p] = p$, a martingale result. An increase in the number of informed or a decrease in per capita noise trading increases $\psi_I$. In the limiting economy as $N \to \infty$, $\tau_e = \tau_E/N$ where $\tau_E$ is fixed. The uninformed do not trade (??). As the informed become risk-neutral, prices become fully revealing in the competitive case, but only half as much in the imperfect competition case.

Endogenizing information acquisition overcomes the schizophrenia problem. This equilibrium is different from the competitive outcome since informed traders now take into account the effect of their actions on the market price. In this case, traders must know the pricing function, the number of other traders, and all other agents’ demand schedules. By accounting for their impact on price, traders no longer completely trade away their informational advantage.
6.4 Batch Models
This section begins with the analysis of market orders in the Kyle (1985) model. A market maker observes the net order flow and sets a single price
at which all orders are cleared. Without price-contingent orders, it is not possible to explore the bid-ask spread or transaction prices. This framework does allow analysis of the effect informed traders’ strategies have on prices. Kyle was the first to develop a model of this nature. Price-contingent orders are taken up in Kyle (1989). The strategic actions of uninformed agents are covered in models such as Admati and Pfleiderer (1988), Admati and Pfleiderer (1989), and Foster and Viswanathan (1990).

Kyle (1985)
The classic Kyle (1985) model uses a single risk-neutral informed trader, a group of noise traders, and a single risk-neutral market maker. The model is dynamic, allowing an analysis of trading strategies over time. The model is presented first in a single period setting. The random future asset value is $\tilde v$, which only the informed trader can observe. The market maker does not explicitly know $v$, but knows $\tilde v \sim N(p_0, \Sigma_0)$. The uninformed traders provide noise in the aggregate order flow $(\tilde x + \tilde u)$, thereby preventing the market maker from perfectly inferring $v$. These noise traders submit orders for $\tilde u \sim N(0, \sigma_u^2)$. The informed trader, who has rational expectations, knows the pricing function and the distribution of noise trades. He chooses an order quantity to maximize his expected profits
$$X(v) = \arg\max E[\Pi(X(\cdot), P(\cdot)) \mid v].$$
The informed trader does not know the price at which his order will be filled. The equilibrium [2] is the pair $P(\cdot)$ and $X(\cdot)$. The market maker sets price equal to the expected value of $v$ conditional on observing $x + u$,
$$P(x + u) = E[\tilde v \mid x + u].$$
The equilibrium is
$$X(\tilde v) = \beta(\tilde v - p_0)$$
$$P(\tilde x + \tilde u) = p_0 + \lambda(\tilde x + \tilde u)$$
where $\beta = \sqrt{\sigma_u^2/\Sigma_0}$ and $\lambda = \frac{1}{2}\sqrt{\Sigma_0/\sigma_u^2}$. The market maker can use his knowledge of $X(\cdot)$ to observe a random variable distributed $N(v, \sigma_u^2/\beta^2)$. Using Bayes rule, his posterior on $v$ is $N(p_0 + \lambda(x + u), \Sigma_0/2)$.

[2] This setup is not game-theoretic, but can be made so by including additional market makers with identical information, or by giving the market maker an objective function. The equilibrium then is such that each player’s strategy is a best response given his information at each stage in the game.

To derive the equilibrium, suppose that $P$ and $X$ are linear functions with coefficients $\mu$, $\lambda$, $\alpha$, and $\beta$:
$$P(y) = \mu + \lambda y$$
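and
$$X(v) = \alpha + \beta v.$$
The expected profit for the informed agent given his signal is
$$E[\tilde\Pi \mid \tilde v = v] = E\big[[\tilde v - P(x + \tilde u)]\,x \mid \tilde v = v\big] = (v - \mu - \lambda x)x.$$
Profit maximization gives the FOC $v - \mu - 2\lambda x = 0$, or $X(v) = \alpha + \beta v$ with $\alpha = -\mu\beta$ and $\beta = 1/(2\lambda)$. The market efficiency condition is
$$\mu + \lambda y = E[\tilde v \mid \alpha + \beta\tilde v + \tilde u = y].$$
Normality makes the regression linear. Applying the projection theorem gives
$$\lambda = \frac{\mathrm{cov}(v, y)}{\mathrm{var}(y)} = \frac{\beta\Sigma_0}{\beta^2\Sigma_0 + \sigma_u^2}$$
and $\mu - p_0 = -\lambda(\alpha + \beta p_0)$. Solving, we get $\mu = p_0$ and $\alpha = -\beta p_0$.

Several characterizations can be made. The unconditional expected profit to the informed is
$$E[\tilde\Pi] = E\big[E[\tilde\Pi \mid v]\big] = E[(v - p_0 - \lambda x)x] = E[\beta(1 - \lambda\beta)(v - p_0)^2] = \frac{1}{2}\beta\Sigma_0 = \frac{1}{2}(\Sigma_0\sigma_u^2)^{1/2}.$$
The variance of the value conditional on the price is, using $\alpha = -\beta p_0$,
$$\mathrm{var}(\tilde v \mid p) = \mathrm{var}\big(v - p_0 - \lambda(\alpha + \beta v + u)\big) = E\big[(v - p_0)(1 - \lambda\beta) - \lambda u\big]^2$$
$$= (1 - \lambda\beta)^2\Sigma_0 + \lambda^2\sigma_u^2 = \Sigma_0/4 + \lambda^2\sigma_u^2 = \frac{1}{2}\Sigma_0.$$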
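The closed-form results above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `kyle_single_period` and `simulate_profit` and the parameter values are hypothetical, chosen only for illustration. It evaluates $\beta$, $\lambda$, the expected profit, and the posterior variance, and compares the expected profit with a simulated average.

```python
import math
import random


def kyle_single_period(sigma0, sigma_u2):
    """Closed-form linear equilibrium of the one-period model in the text.

    sigma0   : prior variance of the asset value (Sigma_0)
    sigma_u2 : variance of the noise traders' order flow (sigma_u^2)
    """
    beta = math.sqrt(sigma_u2 / sigma0)           # informed trading intensity
    lam = 0.5 * math.sqrt(sigma0 / sigma_u2)      # price impact of order flow
    expected_profit = 0.5 * math.sqrt(sigma0 * sigma_u2)
    posterior_var = sigma0 / 2.0                  # var(v | p)
    return beta, lam, expected_profit, posterior_var


def simulate_profit(p0, sigma0, sigma_u2, n_draws=100_000, seed=0):
    """Average informed profit (v - p) * x over simulated draws of v and u."""
    rng = random.Random(seed)
    beta, lam, _, _ = kyle_single_period(sigma0, sigma_u2)
    total = 0.0
    for _ in range(n_draws):
        v = rng.gauss(p0, math.sqrt(sigma0))      # true value
        u = rng.gauss(0.0, math.sqrt(sigma_u2))   # noise order
        x = beta * (v - p0)                       # informed order
        p = p0 + lam * (x + u)                    # market maker's price
        total += (v - p) * x
    return total / n_draws


beta, lam, profit, post_var = kyle_single_period(sigma0=4.0, sigma_u2=1.0)
print(beta, lam, profit, post_var)
print(simulate_profit(p0=10.0, sigma0=4.0, sigma_u2=1.0))
```

With these illustrative inputs the computed $\lambda$ also satisfies the projection formula $\lambda = \beta\Sigma_0/(\beta^2\Sigma_0 + \sigma_u^2)$, which is a quick way to confirm the two conditions are mutually consistent.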
Note that the noise traders have an expected loss, which can be justified with liquidity trading arguments. The noise traders’ loss is the informed trader’s gain. The market maker expects to break even on average by balancing his loss to the informed with the gain from trading with the uninformed.

In the discrete time sequential auction there is a unique linear equilibrium. There are constants $\beta_n$, $\lambda_n$, $\alpha_n$, $\delta_n$, and $\Sigma_n$ such that
$$\Delta\tilde x_n = \beta_n(\tilde v - \tilde p_{n-1})\Delta t_n$$
$$\Delta\tilde p_n = \lambda_n(\Delta\tilde x_n + \Delta\tilde u_n)$$
$$\Sigma_n = \mathrm{var}(\tilde v \mid \Delta\tilde x_1 + \Delta\tilde u_1, \ldots, \Delta\tilde x_n + \Delta\tilde u_n)$$
$$E[\tilde\pi_n \mid p_1, \ldots, p_{n-1}, v] = \alpha_{n-1}(v - p_{n-1})^2 + \delta_{n-1}$$
Given $\Sigma_0$, the constants are a unique solution to a difference equation system
$$\alpha_{n-1} = \frac{1}{4\lambda_n(1 - \alpha_n\lambda_n)}$$
$$\delta_{n-1} = \delta_n + \alpha_n\lambda_n^2\sigma_u^2\Delta t_n$$
$$\beta_n\Delta t_n = \frac{1 - 2\alpha_n\lambda_n}{2\lambda_n(1 - \alpha_n\lambda_n)}$$
$$\lambda_n = \beta_n\Sigma_n/\sigma_u^2$$
$$\Sigma_n = (1 - \beta_n\lambda_n\Delta t_n)\Sigma_{n-1}$$
The derivation of the above results follows three steps. First, solve for the informed agent’s trading strategy as a function of the price function. Second, find the price function that is consistent with market efficiency given optimal trades. Finally, show the difference equation system implied by the first two steps has a solution.
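As a consistency check, the difference equation system can be specialized to a single auction. The sketch below assumes $N = 1$, $\Delta t_1 = 1$, and terminal values $\alpha_1 = \delta_1 = 0$ (no profit remains after the last auction). These boundary values are an assumption made here for illustration; the text above does not state them. Under those assumptions the system appears to collapse to the single period equilibrium.

```latex
\begin{align*}
\alpha_0 &= \frac{1}{4\lambda_1}, \qquad
\delta_0 = 0, \qquad
\beta_1 = \frac{1}{2\lambda_1}, \qquad
\Sigma_1 = (1 - \beta_1\lambda_1)\Sigma_0 = \tfrac{1}{2}\Sigma_0, \\
\lambda_1 &= \frac{\beta_1\Sigma_1}{\sigma_u^2}
           = \frac{\Sigma_0}{4\lambda_1\sigma_u^2}
  \;\Longrightarrow\;
  \lambda_1 = \tfrac{1}{2}\sqrt{\Sigma_0/\sigma_u^2}, \qquad
  \beta_1 = \sqrt{\sigma_u^2/\Sigma_0}, \\
E[\tilde\pi_1] &= \alpha_0\,E\big[(\tilde v - p_0)^2\big] + \delta_0
  = \frac{\Sigma_0}{4\lambda_1}
  = \tfrac{1}{2}(\Sigma_0\sigma_u^2)^{1/2}.
\end{align*}
```

The resulting $\beta_1$, $\lambda_1$, posterior variance $\Sigma_0/2$, and expected profit match the single period expressions.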
In a continuous time setting, the noise trading $u(t)$ follows a Brownian motion. Therefore, the uninformed quantity is independent through time. Since this independence will not be true for the informed trader, there is a linkage between quantity and information that causes prices to (eventually) reflect all information. The informed trader need not trade the same amount every period. He changes his trade size to try to “hide” from the market maker. The prices have a constant volatility as information is gradually incorporated into prices at a constant rate. Prices follow a martingale (and a random walk), so they are efficient in the semi-strong sense. The informed trader profits more by continuously trading than by using a mixed strategy attempting to manipulate prices. You could not tell that there is an informed trader by looking at prices alone. The continuous time setting makes it possible to spread information quickly without removing the incentives to acquire information [Grossman and Stiglitz (1980)].

The speed with which the informed trader pushes prices to the true value measures resiliency. This speed is the difference between his private information and the current price, divided by the remaining trading time. The depth of the market, constant over time, is proportional to the amount of noise trading and is inversely proportional to the amount of private information. The market is infinitely tight in continuous time.

There are many extensions to the model. You could let the market maker know more about the distribution of orders than the market as a whole. This drastically reduces the informed trader’s ability to make profits, and prices reflect information much more quickly. Another extension allows multiple informed traders. Foster and Viswanathan (1990) is an example of this, where the normality assumption is relaxed to elliptical distributions. The result is that competition among the informed forces prices to their full-information levels almost immediately, eliminating the smoothing behavior. Their work also shows that the Kyle results may be sensitive to the normality assumption. Kyle (1989) uses a more complex trading mechanism (limit orders) in a single period setting to overcome this problem.
6.4.1
Strategic Uninformed Traders
Strategic uninformed trading may allow these agents to reduce their trading losses. This may create price effects by the uninformed traders, as they attempt to “hide” from the informed traders. Admati and Pfleiderer (1988) and Admati and Pfleiderer (1989) examine intraday timing decisions of the
uninformed. Foster and Viswanathan (1990) focus on interday effects as the levels of public and private information vary across days.

Admati & Pfleiderer (1988)
There is empirical evidence of U-shaped patterns in intraday volume and volatility. In Admati and Pfleiderer (1988) the risk-neutral [3] informed traders get their information one period before it becomes public knowledge. The informed then just decide the optimal order size in each period. The uninformed discretionary traders cannot split trades, but they do decide when to trade. There are also nondiscretionary traders providing noise in the model. The competitive informed traders do not consider the price consequences of their actions.

Uninformed traders end up clustering their trades. This clustering can improve the liquidity of the market and reduce their losses to the informed. The informed traders recognize the clumping of uninformed trades and will also trade during these periods, intensifying the clustering. The concentration of discretionary liquidity traders does not affect the amount of information revealed by prices or the variance of price changes if the number of informed traders is fixed. This is because there is an increase in informed trading just enough to keep the informational content the same. Endogenizing information acquisition intensifies the concentration of trading and prices become more informative. The liquidity traders are better off with no informed traders, but if there are any, the cost of trading decreases with the number of informed.

A critical assumption is the independence of trade between periods. Subsequent prices will not reflect previous order flow. If the uninformed are allowed to split their trades, an equilibrium may not exist, and if it does it may not be unique. The results are also sensitive to the assumptions about the risk preferences of the informed traders.
If they are risk-averse, then it may not be the case that periods with more informed traders result in better prices for the uninformed. Thus, the clumping may not hold if traders are risk-averse. If uninformed trade flows become more informative over time, uninformed traders will be more likely to trade early.

[3] The results do not change with risk-averse liquidity traders.

Admati & Pfleiderer (1989)
Admati and Pfleiderer (1989) examine patterns in mean returns in a framework where market makers reduce the adverse selection problem by inducing patterns in volume and price. By changing the bid or ask commission, the market maker can change the expected number of liquidity sellers and buyers. The market maker’s expected loss to the informed decreases with the commission, but so do his expected profits on the discretionary liquidity traders. The market maker processes trades in a manner combining some features of batch and sequential trading. Traders do know the prices at which they will transact, but prices are updated after every period in time, not after every trade. Equilibrium trading results in all discretionary buying occurring in a single period, and similarly for selling. This is because the liquidity trading reduces the adverse selection problem. The paper uses a market where traders can only buy on even days and sell on odd days as an example.

Foster & Viswanathan (1990)
In Foster and Viswanathan (1990) an interday pattern in trading arises because the informational advantage of the informed decreases over time as the uninformed infer information from the price and the market maker from the order flow. The informed trader will be at the greatest advantage when the market first opens, such as in the morning or on Monday. The model is an extension of Kyle (1985). There is only one informed trader and the uninformed act competitively. The ability of the uninformed to choose when to trade creates the temporal pattern. ?? The sensitivity of the price to the order flow increases with the amount of information released by the informed and falls with the amount of liquidity trading. The informed trades more when there are more liquidity traders to hide his trade. Consequently, he has a higher profit when there is more liquidity trading, or when he releases more private information. The release of private information will be smooth throughout the day.
Slezak (1994)
Slezak (1994) develops a multiperiod generalization of Grossman and Stiglitz (1980) that produces patterns in both the mean and variance of returns without relying on irrationality, bubbles, or strategic liquidity trading. These patterns arise because of the effect market closures have on the information structure in the economy.
The model uses risk-averse agents. Market closures alter investor uncertainty by changing the timing of resolution of uncertainty and by reducing the informed agent’s comparative advantage at risk bearing. Closures affect the variance of returns by altering the informativeness of the price. Post closure prices reflect a greater proportion of private news on the reopening day, but less private news accumulated over the closure. Preclosure prices are relatively less informative as well. Post closure liquidity costs are higher since increased adverse selection causes the uninformed to provide less liquidity.
6.5
Sequential Trade Models
Sequential trade models allow for the analysis of bid-ask spreads and the details of the price process. The main underlying idea is that an informed trader will prefer to buy when the price is low and sell when it is high. The market maker will lose money on him if there is a single price. By introducing a bid-ask spread the market maker can offset the losses to the informed with gains from the uninformed.
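The way a spread lets the market maker offset losses to the informed with gains from the uninformed can be sketched numerically. The example below is a stylized illustration under assumptions that are not in the notes: a two-point asset value, a known share of informed traders, and uninformed traders who buy or sell with equal probability. The function name `zero_profit_quotes` and its parameters are hypothetical.

```python
def zero_profit_quotes(v_low, v_high, prob_high, share_informed):
    """Bid and ask giving a competitive market maker zero expected profit per trade.

    Assumed setup (illustrative only): the asset is worth v_high with prior
    probability prob_high and v_low otherwise. A fraction share_informed of
    arriving traders knows the value, buying when it is high and selling when
    it is low. The remaining traders buy or sell with probability 1/2 each.
    """
    noise = (1.0 - share_informed) / 2.0   # Pr[trader is uninformed and buys] (same for sells)
    buy_if_high = share_informed + noise   # Pr[buy order | value is high]
    buy_if_low = noise                     # Pr[buy order | value is low]
    sell_if_high = noise                   # Pr[sell order | value is high]
    sell_if_low = share_informed + noise   # Pr[sell order | value is low]

    # Bayes rule: posterior probability of the high value after each order type.
    post_high_buy = (prob_high * buy_if_high) / (
        prob_high * buy_if_high + (1.0 - prob_high) * buy_if_low)
    post_high_sell = (prob_high * sell_if_high) / (
        prob_high * sell_if_high + (1.0 - prob_high) * sell_if_low)

    ask = v_low + (v_high - v_low) * post_high_buy    # E[value | buy order]
    bid = v_low + (v_high - v_low) * post_high_sell   # E[value | sell order]
    return bid, ask


bid, ask = zero_profit_quotes(v_low=0.0, v_high=1.0, prob_high=0.5, share_informed=0.2)
print(bid, ask, ask - bid)
```

In this sketch the spread vanishes when there are no informed traders and widens as their share rises, which is the adverse selection effect in its simplest form.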
6.5.1
Specialists and Dealers
Amihud & Mendelson (1980)
In Amihud and Mendelson (1980) the (risk-averse ??) market maker maximizes expected profits by changing the bid and ask. This can give rise to an asymmetric bid-ask spread as the market maker manages his inventory. This contrasts with Admati and Pfleiderer (1989) where an asymmetric spread results from information effects.

In the model the market maker is a monopolist who sets bid and ask prices ($p_b$ and $p_a$) to maximize expected profits. The quotes are good for a single transaction. The arrival of buy and sell orders is Poisson with rates $D(p_a)$ and $S(p_b)$ where $D' < 0$, $S' > 0$. The market maker dislikes extreme inventory positions because they force him to take transactions under unfavorable conditions. To stay in his desired inventory range the market maker adjusts bid and ask prices to manage his inventory.

Glosten & Milgrom (1985)
Glosten and Milgrom (1985) model the market maker’s pricing decision in
an environment where he learns from previous trades. In a competitive market, informed trades will reflect their information. A sell order will lower the market maker’s expectation, while a buy order will raise it. The competitive market maker sets the bid and ask such that his expected profit on any trade is zero. The bid reflects the expected value of an asset conditional on a sell order arriving. Bayes rule updates the conditional probabilities as trades occur. Since the distribution of trades differs depending on the true state, the market maker will eventually learn the informed trader’s information. Many of the results stem from the fact that the informed can only trade a single unit at a point in time.

Prices follow a martingale with respect to the specialist’s and public information; price changes will be serially uncorrelated. Spreads due to adverse selection are different from spreads arising from transactions costs, risk aversion, or monopoly power. These other sources of spreads will lead to negative serial correlation. The spread can be expressed as $\Psi + 2c$, where $\Psi$ is the adverse selection cost and $c$ is the cost of transacting. The covariance of adjacent price changes is $-\frac{1}{2}\Psi c - c^2$. This is similar to Roll (1984), with the addition of the adverse selection component. The variance of a price change is $\theta^2 + (\Psi/2)^2 + c\Psi + 2c^2$, where $\theta^2$ is the variance of public information arriving exogenously between trades.

The bid-ask spread reflects the informational asymmetry. With large volume the spread will be small. As the market maker learns the insider’s information the valuation of the informed trader and the market maker converge. The spread will increase when the informed traders’ information is better, insiders become relatively more numerous, or the elasticity of the supply and demand of uninformed traders increases. If the adverse selection problem is too large, then the market may collapse as in Akerlof (1970). If the market closes for this reason the problem only gets worse. Once a market closes it will stay closed until the information asymmetry is reduced.

Glosten (1989)
When investors trade on private information it can lead to suboptimal risk sharing if the market maker reduces the liquidity of the market. Glosten (1989) looks at whether the monopoly power of the specialist can preserve market liquidity and avoid market failure. In Glosten and Milgrom (1985) the market maker is competitive. He sets the price to have a zero expected profit on every trade. When information
asymmetries are large the market may fail completely. Furthermore, market closure generally makes the information asymmetry worse. By giving the market maker monopoly power [4] he can set prices to average profits across trades. He will lose on trades with the informed, but compensate with trades to the uninformed. The result is increased liquidity. The market maker is risk-neutral so there are no inventory costs associated with risk bearing. The model also ignores any dynamic trading.

[4] The market maker need not literally be a monopolist. He has superior information about the trading process from his order book, but may face competition from limit orders and other floor traders. Limit orders allow traders to provide liquidity to the market and compete with the market maker. What is important is his ability to average profits across trades.

Rock (1989)
Rock (1989) examines the interaction of the specialist’s order book and prices. Risk-neutral uninformed traders submit limit orders. A risk-averse market maker competes with the orders in the book. The market maker has two advantages. First, he knows the size of the trade. Second, he moves second so he can get out of the way of big trades and fill them from the book, creating an adverse selection problem. The book orders tend to only get the unprofitable trades.

Limit orders provide liquidity to the market. These orders have an option component to them. The order is an obligation to buy or sell at the specified price. Since the order submitters are writing the option, they receive an option premium in the form of reduced transactions costs. These investors avoid the adverse selection component of the bid-ask spread by standing ready to transact ahead of time.

The assumptions about risk preferences are important in this model. If the specialist were risk-neutral he would not need to bother with the order book. It is the risk neutrality of the limit order submitters that gives them a comparative advantage at risk bearing. If the limit order submitters were risk-averse they would only submit orders in response to inventory positions, etc. The risk aversion of the specialist will cause him to take transactions at prices that may differ from underlying value.
6.5.2
Other Topics
Trading Volume
Volume of trade generally increases with the precision of private information. Equilibrium beliefs are not always more homogeneous if information is more precise.

Sale of Information
People with private information can profit from it by selling it or by trading on it themselves. The more selling they do, the less valuable the information is in their trading. The information seller can add noise to the information (either the same noise for each purchaser or unique noises). Selling can also be done indirectly, as in a mutual fund.

Regulation
The adverse selection component of trading costs is like a tax on noise traders that subsidizes the acquisition of private information and its release through the price system. Regulators can attempt to influence the liquidity of markets and the informativeness of prices. Attempts to reduce noise trading on the grounds that it destabilizes prices may not work. It is noise trading that attracts informed traders to the market in the first place. Reducing noise trading may actually reduce the informativeness of prices.
6.6
Special Topics
6.6.1
Bubbles
Bubbles deal with deviations from fundamental value. Shiller (1981) is one of the classic papers in this area. Refer to Section 2.8.6 for more information. Tirole () has a no-trade result in a dynamic context where trade does not occur because it would burst a bubble.

Blanchard & Watson (1982)
Blanchard and Watson (1982) argue that rational bubbles are possible even in efficient markets. The market price can be expressed as the fundamental
value plus a bubble
$$p_t = p_t^* + c_t$$
where $E[c_t \mid \Omega_{t-1}] = (1 + r)c_{t-1}$. A deterministic bubble is given by $c_t = c_0(1 + r)^t$. The bubble grows with time so that it eventually dominates the fundamental value portion of the price. Since the growth must continue forever for the price to be rational, this type of bubble is implausible. A stochastic bubble is created by adding a random shock to a deterministic bubble. A stochastic crash bubble takes the value $c_t = \mu_t + c_{t-1}(1 + r)/\pi$ with probability $\pi$ and $c_t = \mu_t$ otherwise. In this case $E[\mu_t \mid \Omega_{t-1}] = 0$. This produces a situation where the bubble will persist with probability $\pi$ or crash. The average return is greater than $r$ to compensate for the risk of a crash. Arbitrage does not eliminate these bubbles.

Since the bubble grows in any of the above cases, as the time horizon becomes infinite the bubble will be infinitely large. Since some assets, such as bonds, have finite lives the bubble must be zero at their maturity. Therefore bubbles are ruled out for these securities. The structure above also rules out negative bubbles since they imply negative security prices with a positive probability.

Empirically detecting bubbles is challenging. To use the price process to say something interesting about bubbles requires an understanding of the fundamental value process, including the information sets available. Tests for bubbles can be divided into variance bounds and patterns in innovations. The variance bounds tests, such as Shiller (1981), put upper bounds on the conditional or unconditional variances of prices relative to the variance of dividends. The innovation patterns tests look for either runs in shocks or extreme outliers.
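The stochastic crash specification can be illustrated with a short simulation. This is a minimal sketch with hypothetical function names (`simulate_bubble`, `one_step_mean`) and illustrative parameter values. It checks that the one-step conditional mean equals $(1 + r)c_{t-1}$, and that growth conditional on survival, $(1 + r)/\pi - 1$, exceeds $r$.

```python
import random


def simulate_bubble(c0, r, pi, n_periods, shock_sd=0.0, seed=0):
    """Simulate one path of the stochastic crash bubble.

    With probability pi the bubble survives and grows at (1 + r) / pi,
    otherwise it collapses to the shock mu_t alone. shock_sd is the
    standard deviation of the mean-zero shock mu_t (an assumed normal).
    """
    rng = random.Random(seed)
    path = [c0]
    for _ in range(n_periods):
        mu = rng.gauss(0.0, shock_sd)
        if rng.random() < pi:
            path.append(mu + path[-1] * (1.0 + r) / pi)  # bubble persists
        else:
            path.append(mu)                              # bubble crashes
    return path


def one_step_mean(c_prev, r, pi, n_draws=100_000, seed=1):
    """Monte Carlo estimate of E[c_t | c_{t-1} = c_prev] with mu_t set to zero."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        if rng.random() < pi:
            total += c_prev * (1.0 + r) / pi
    return total / n_draws


r, pi = 0.05, 0.9
print(one_step_mean(1.0, r, pi))      # should be close to 1 + r
print((1.0 + r) / pi - 1.0)           # growth rate while the bubble survives
print(simulate_bubble(1.0, r, pi, n_periods=10))
```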
6.6.2
Speculation
Hart & Kreps (1986) Hart and Kreps (1986) show that, contrary to common belief, speculation can destabilize prices. Speculators buy when the chances of price appreciation are high, which is not necessarily when prices are actually low.
6.6.3
Noise
DeLong, Shleifer, Summers, & Waldmann (1990)
DeLong, Shleifer, Summers, and Waldmann (1990) develop an overlapping generations model with irrational noise traders. The rational investors do not fully exploit the irrational investors. Their short-run focus prevents them from completely wiping out the irrational investors. “Noise trader risk” is the chance that marketwide irrational beliefs of the noise traders may become even more irrational before reverting to their mean. Essentially the noise trader beliefs are slowly mean reverting. If an arbitrageur has a limited investment horizon there is a chance that the prices will not return to their true value before he has to close out his position. In fact, if the beliefs become more irrational the arbitrageur may face a loss.

There are several plausible predictions from the model. Prices are more volatile with noise trading. If the noise traders’ opinions are stationary there will be a mean-reverting component to stock returns. Assets may be underpriced relative to fundamental value, consistent with the equity premium puzzle.
6.6.4
Cascades
Bikhchandani, Hirshleifer, & Welch (1992)
Bikhchandani, Hirshleifer, and Welch (1992) generalize the idea of IPO cascades in Welch (1992). For the details of cascades refer to the discussion of the original paper in Section 5.10.1. An information cascade describes a sequence of decisions where individuals ignore their own private information in favor of information inferred from the observation of others’ decisions. Cascades can be reversed by the release of new information.
Chapter 7
International Finance
7.1
Introduction
What distinguishes international finance from traditional finance is the addition of foreign exchange rate assets, both spot and forward. There are several measures of returns in international finance. The return from currency speculation by buying forward and selling spot is $(f_t - s_{t+1})/s_t$. Mean returns are generally close to zero. Depreciation is defined as $(s_{t+1} - s_t)/s_t$. The forward premium is $(f_t - s_t)/s_t$.

There are several basic concepts that are important in international finance. Covered Interest Rate Parity dictates that
$$\exp(r_t^i) = \exp(r_t^j)\frac{F_t^{ij}}{S_t^{ij}} \quad\text{or}\quad r_t^i - r_t^j = f_t^{ij} - s_t^{ij}$$
to prevent arbitrage. Uncovered Interest Rate Parity states that
$$\exp(r_t^i) = \exp(r_t^j)\frac{E[S_{t+1}^{ij}]}{S_t^{ij}} \quad\text{or}\quad r_t^i - r_t^j = E[s_{t+1}^{ij}] - s_t^{ij}.$$
In terms of notation, the $ij$ superscript indicates the price of a unit of currency $j$ in terms of currency $i$. Purchasing Power Parity (PPP) is another no arbitrage condition that says the prices of a good in different countries must be the same after converting currencies. Floating rates began in 1973. The period shortly thereafter is known as the “dirty float” period.
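Covered interest parity and the three return measures can be put into a few lines of code. This is a minimal sketch with hypothetical function names and made-up inputs. It computes the forward rate implied by covered interest parity under continuous compounding, and shows that the forward premium is the sum of depreciation and the speculation return as defined above.

```python
import math


def cip_forward(spot_ij, r_i, r_j):
    """Forward rate implied by covered interest parity.

    spot_ij : spot price of one unit of currency j in units of currency i
    r_i, r_j: continuously compounded interest rates in the two currencies
    """
    return spot_ij * math.exp(r_i - r_j)


def return_measures(f_t, s_t, s_next):
    """Speculation return, depreciation, and forward premium, as in the text."""
    speculation = (f_t - s_next) / s_t
    depreciation = (s_next - s_t) / s_t
    forward_premium = (f_t - s_t) / s_t
    return speculation, depreciation, forward_premium


# Illustrative numbers only.
forward = cip_forward(spot_ij=1.50, r_i=0.05, r_j=0.03)
print(forward, math.log(forward) - math.log(1.50))   # log difference equals r_i - r_j

spec, dep, prem = return_measures(f_t=forward, s_t=1.50, s_next=1.52)
print(spec, dep, prem, dep + spec)                   # prem equals dep + spec
```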
There are two puzzles in international finance. The first is the deviation of the forward rate from the expected future spot rate. This captures the difference between covered and uncovered interest parity. The second puzzle is the home country bias — too little investment in foreign assets.
7.2
Spot Currency Pricing
Lucas (1982)
The Lucas (1982) model extends Lucas (1978) to international asset pricing. In the model there are two infinitely-lived countries with identical agents. There are two non-storable goods, no production, stochastic endowment shocks, and monetary instability. The model is developed first in a barter economy, then in a world with a single currency, and finally with national currencies and flexible exchange rates.

Country 0 produces good $X$ in amounts $\{\xi_t\}$. Similarly, country 1 produces good $Y$ in amounts $\{\eta_t\}$. Denote the price of $Y$, in units of $X$, in state $s$ as $p_Y(s)$. The prices of the future streams $\{\xi_t\}$ and $\{\eta_t\}$ are given by $q_X(s)$ and $q_Y(s)$, again in units of $X$. An agent with wealth $\theta$ chooses consumption of $(X, Y)$ at prices $(1, p_Y(s))$ and shares $(\theta_X, \theta_Y)$ of $(\{\xi_t\}, \{\eta_t\})$ at prices $(q_X(s), q_Y(s))$. The agent’s objective is to
$$\max E\left[\sum_{t=0}^{\infty}\beta^t U(X_{it}, Y_{it})\right]$$
subject to a budget constraint and a cash-in-advance constraint. This means that the value of current period endowments cannot be used in trading for assets or the other consumption good until next period. Each agent can be viewed as a two member household. One member collects the endowment and exchanges it for currency while the other uses existing currency to trade assets and goods. The two members do not interact until the end of the trading period.

With national currencies, the monetary shocks are given by
$$\Delta M_{t+1} = w_{0,t+1}M_t \quad\text{and}\quad \Delta N_{t+1} = w_{1,t+1}N_t.$$
Within each country the price of the home good in terms of home currency is
$$p_X(s, M) = M/\xi \quad\text{and}\quad p_Y(s, N) = N/\eta.$$
Also note the price of $Y$ in terms of $X$ can be expressed as
$$p_Y(s) = \frac{\partial U/\partial Y}{\partial U/\partial X}.$$
The exchange rate (currency 0 per unit of currency 1) is
$$e(s, M, N) = \frac{p_X(s, M)}{p_Y(s, N)}\,p_Y(s) = \frac{M/\xi}{N/\eta}\,p_Y(s) = \frac{\pi_Y}{\pi_X}\,p_Y(s)$$
where $\pi_i = 1/p_i(s, \cdot)$ gives the purchasing power for country $i$. In equilibrium,
$$V_i(t)\pi_i(t)\frac{\partial U(t)}{\partial i} = E\left[\beta\frac{\partial U(t+1)}{\partial i}\pi_i(t+1)[V_i(t+1) + D_i(t+1)]\right].$$
For a riskless asset
$$B_i(t) = E\left[\beta\frac{\partial U(t+1)/\partial i}{\partial U(t)/\partial i}\,\frac{\pi_i(t+1)}{\pi_i(t)}\right] = E[m].$$
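The exchange rate formula is easiest to read in a special case. The sketch below assumes log utility, $U(X, Y) = \ln X + \ln Y$, and assumes that in equilibrium each of the two identical agents consumes half of each endowment. Both are assumptions made here for illustration and are not stated in the notes. Under them the relative endowments cancel and the exchange rate depends only on relative money supplies.

```latex
\begin{align*}
U(X, Y) &= \ln X + \ln Y, \qquad X = \xi/2, \quad Y = \eta/2, \\
p_Y(s) &= \frac{\partial U/\partial Y}{\partial U/\partial X}
        = \frac{1/Y}{1/X} = \frac{X}{Y} = \frac{\xi}{\eta}, \\
e(s, M, N) &= \frac{M/\xi}{N/\eta}\,p_Y(s)
        = \frac{M\eta}{N\xi}\cdot\frac{\xi}{\eta} = \frac{M}{N}.
\end{align*}
```

With other utility functions the endowment terms generally do not cancel, so real shocks would also move the exchange rate.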
7.3
Forward Currency Pricing
Hansen & Hodrick (1983)
Hansen and Hodrick (1983) study the determinants of the risk premium in foreign exchange rates. This premium arises when the forward rate is not equal to the expected future spot rate, $F_t \neq E[S_{t+1}]$. The basic idea is to test the orthogonality condition
$$E[Q_{m,t+k}(s^j_{t+k} - f^j_{t,k})] = 0$$
where $Q_{m,t+k}$ is the IMRS of money. To make the above condition testable the authors propose three models. The first is a lognormal model which implies a constant risk premium. The second uses a riskless nominal rate and assumes a constant conditional covariance. The third is a latent variable model. The first two models are rejected, while the third provides some evidence that the risk premium is important. These tests are joint tests of the orthogonality condition and the auxiliary restrictions in each of the three models.

Fama (1984)
Fama (1984a) uses the same basic framework as Fama (1984b), which looks at Treasury bills. The idea here is to determine the information in the forward
premium about forecast errors and changes in the spot rate. This research shows that excess returns are not only predictable ex ante, but also that the variance of the predictable component exceeds the variance of the expected rate change.

The analysis begins with a specification for the components of the forward rate
$$f_t = E[s_{t+1}] + p_t$$
where the lower case letters indicate logs and $p_t$ is the premium. This can be modified to represent the forward premium, which is then used to predict the forecast error and spot rate innovation
$$f_t - s_t = E[s_{t+1} - s_t] + p_t$$
$$f_t - s_{t+1} = \alpha_1 + \beta_1(f_t - s_t) + \varepsilon_{1,t+1}$$
$$s_{t+1} - s_t = \alpha_2 + \beta_2(f_t - s_t) + \varepsilon_{2,t+1}.$$
Adding the last two equations implies that $\alpha_1 + \alpha_2 = 0$, $\beta_1 + \beta_2 = 1$, and $\varepsilon_{1,t+1} + \varepsilon_{2,t+1} = 0$. Fama finds that both components vary through time, but most of the variation in the forward rate is due to the premium. The null hypothesis is $\beta_1 = 0$ and $\beta_2 = 1$. The estimated coefficients are $\beta_1 > 1$ and $\beta_2 < 0$. Thus, the premium and expected future spot rate are negatively correlated.

Possible explanations for these findings can be categorized as either a risk premium story or some type of forecast errors. The risk premium can arise in either a CAPM or a dynamic general equilibrium setting if investors have rational expectations. While a risk premium could account for non-zero excess returns, it does not explain the high variability. Explanations based on forecast errors may rely on either rational or irrational agents. Examples of cases with rational investors include learning models and the peso problem.

Mark (1988)
Mark (1988) allows time-variation in beta or the risk premium in a single beta CAPM to attempt to explain the forward premium puzzle. The conditional beta comes from an ARCH model. Using GMM, he fails to reject the model, indicating that there is evidence of time-varying beta. Additional tests reject the hypothesis of a constant beta.
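The adding-up restrictions in the two Fama regressions are an algebraic identity, which a small simulation can confirm. The sketch below uses made-up data (a random walk for the log spot rate and a noisy forward rate), so the estimated slopes carry no empirical meaning. The helper names `ols` and `fama_regressions` are hypothetical.

```python
import random


def ols(y, x):
    """Intercept and slope from a univariate least squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fama_regressions(f, s):
    """Regress f_t - s_{t+1} and s_{t+1} - s_t on the forward premium f_t - s_t."""
    T = len(s) - 1
    premium = [f[t] - s[t] for t in range(T)]
    forecast_error = [f[t] - s[t + 1] for t in range(T)]
    spot_change = [s[t + 1] - s[t] for t in range(T)]
    return ols(forecast_error, premium), ols(spot_change, premium)


# Simulated log spot and forward rates (illustrative data only).
rng = random.Random(1)
s = [0.0]
for _ in range(500):
    s.append(s[-1] + rng.gauss(0.0, 0.03))
f = [s_t + rng.gauss(0.0, 0.01) for s_t in s]

(alpha1, beta1), (alpha2, beta2) = fama_regressions(f, s)
print(alpha1 + alpha2, beta1 + beta2)   # intercepts sum to 0, slopes sum to 1
```

Because the two dependent variables sum to the regressor, the restrictions hold in any sample. Only the individual values of $\beta_1$ and $\beta_2$ are informative.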
Froot & Frankel (1989)
Froot and Frankel (1989) use survey data to extend the analysis in Fama (1984a). They focus on the regression
$$s_t - s_{t-1} = \alpha + \beta(f_{t-1} - s_{t-1}) + \varepsilon_t$$
and decompose beta into $\beta = 1 - \beta_{re} - \beta_{rp}$. The term $\beta_{re}$ captures failure of rational expectations while $\beta_{rp}$ represents the risk premium. The priors are that $\beta_{rp}$ is large and $\beta_{re}$ small, but the authors find the opposite. The risk premium does not appear to be an economically important source of the forward premium. The authors fail to reject the hypothesis that all the bias in the forward premium is due to expectation errors. Contrary to Fama (1984a), Froot and Frankel find that the variance of expected depreciation is large relative to the variance of the risk premium and the risk premium is uncorrelated with the forward discount. This analysis does not incorporate learning effects or the “peso problem.”

Backus, Gregory & Telmer (1993)
Backus, Gregory, and Telmer (1993) view the evidence on forward premiums in the same light as the equity premium puzzle. They introduce habit persistence to get around the high risk aversion implied by models with representative agents and time-separable utility. The model is tested using GMM estimation and simulations. The statistical properties of forward and spot rates imply predictable returns from speculation. These returns are highly variable and imply a highly variable pricing kernel. Using GMM, the authors reject models with power utility and a particular specification of habit persistence. Simulations are used to place more structure on the theory. The evidence is partially consistent with the revised theory.

Huang (1989)
Huang (1989) examines the risk-return characteristics of the term structure of forward FX. The analysis is much like Hansen and Hodrick (1983), but in a multiple maturity setting. The evidence is that there appear to be some country-specific effects in the short (1 month) end of the term structure.
In particular, Huang rejects the model using one month forwards, contrary to
Hansen and Hodrick. With 3, 6, and 12 month forwards and with multiple maturities he fails to reject. These results are important since virtually all other papers in the literature (at least the ones mentioned here) use one month forwards. If there are strange influences on this maturity then the results in other papers may not be robust.
7.4
Integration
Bekaert and Harvey (1995) develop a conditional regime-switching model where expected returns are a weighted average of returns in integrated and segmented markets
$$E[r_i] = \phi\lambda\,\mathrm{cov}(r_i, r_W) + (1 - \phi)\lambda\,\mathrm{var}(r_i)$$
where $\phi$ is the probability the market is integrated. This probability is estimated with regime switching models assuming constant or time-varying transition probabilities. The authors find evidence of a time-varying world price of risk related to the business cycle (the Sharpe ratio is high in a trough). There is also evidence of time-varying integration for a number of countries.

Evans & Lewis (1995)
Evans and Lewis (1995) study whether long swings in the dollar can affect risk premium estimates. Using a regime-switching model they find that long swings make the risk premium appear to contain a permanent disturbance and can bias the Fama-style forward regressions. The authors are also unable to reject the restriction that the actual forward premium equals the risk premium plus the expected change in the exchange rate.
7.5 International Asset Pricing
Skip. Papers: Stulz (1981), Bansal, Hsieh, & Viswanathan (1993), Dumas & Solnik (1995), Ferson & Harvey (1993).
7.6 Other Topics
Skip. Papers: Bekaert & Hodrick (1992), Engle & Hamilton (1990), Engle, Ito, & Lin (1990).
Chapter 8 Appendix: Math Results
8.1 Basics
8.1.1 Norms
A norm measures the magnitude of a vector. The Euclidean norm is the common measure.
$$\|x\| \equiv \sqrt{x'x} \equiv \Big[\sum_i x_i^2\Big]^{1/2}$$
8.1.2 Moments
Moments describe the characteristics of a distribution. The $i$th moment is $\mu_i' = E[x^i]$ and the $i$th central moment is $\mu_i = E\big[(x - E[x])^i\big]$. The first moment is the mean and the second central moment the variance.
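The definitions can be illustrated with sample analogs. The following is a minimal sketch that treats a list of equally likely outcomes as the distribution; the function names `raw_moment` and `central_moment` are hypothetical and chosen only for this example.

```python
def raw_moment(xs, i):
    """i-th moment E[x^i], treating the values in xs as equally likely."""
    return sum(x ** i for x in xs) / len(xs)


def central_moment(xs, i):
    """i-th central moment E[(x - E[x])^i] under the same assumption."""
    mean = raw_moment(xs, 1)
    return sum((x - mean) ** i for x in xs) / len(xs)


# Assumed toy data, not taken from the notes.
data = [1.0, 2.0, 3.0, 4.0]
mean = raw_moment(data, 1)          # first moment: the mean
variance = central_moment(data, 2)  # second central moment: the variance
# The variance also equals E[x^2] - (E[x])^2.
variance_alt = raw_moment(data, 2) - mean ** 2
print(mean, variance, variance_alt)
```

The last line checks the familiar identity linking the second raw moment to the variance.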
8.1.3 Distributions
Normal
If $x \sim N(\mu, \sigma^2)$ the density and characteristic functions are
$$f(x) = (2\pi\sigma^2)^{-1/2} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
$$\phi(t) = \exp\left(i\mu t - \frac{\sigma^2 t^2}{2}\right)$$
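Lognormal
If $x$ is normally distributed, then $z = e^x$ is lognormal (its log is normal).
$$f(z) = \frac{1}{\sigma z\sqrt{2\pi}} \exp\left(-\frac{(\ln z - \mu)^2}{2\sigma^2}\right)$$
$$\bar z = \exp\left(\mu + \sigma^2/2\right)$$
$$\mathrm{var}(z) = \exp(2\mu + \sigma^2)\left(\exp(\sigma^2) - 1\right)$$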
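The lognormal mean and variance formulas can be checked numerically. The sketch below integrates $e^{kx}$ against the normal density of $x$ with a midpoint rule and compares the result with the closed forms; the parameter values and all function names are assumptions made for the illustration.

```python
import math


def lognormal_pdf(z, mu, sigma):
    """Density of z = exp(x) with x ~ N(mu, sigma^2)."""
    return math.exp(-(math.log(z) - mu) ** 2 / (2 * sigma ** 2)) / (
        sigma * z * math.sqrt(2 * math.pi)
    )


def lognormal_mean(mu, sigma):
    return math.exp(mu + sigma ** 2 / 2)


def lognormal_var(mu, sigma):
    return math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)


def numeric_moment(mu, sigma, power, steps=20000, width=10.0):
    """Midpoint-rule approximation of E[z^power], integrating over x = ln z."""
    lo = mu - width * sigma
    h = 2 * width * sigma / steps
    total = 0.0
    for j in range(steps):
        x = lo + (j + 0.5) * h
        density = math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(
            2 * math.pi * sigma ** 2
        )
        total += math.exp(power * x) * density * h
    return total


mu, sigma = 0.05, 0.2  # assumed example parameters
m1 = numeric_moment(mu, sigma, 1)
m2 = numeric_moment(mu, sigma, 2)
print(m1, lognormal_mean(mu, sigma))
print(m2 - m1 ** 2, lognormal_var(mu, sigma))
```

Integrating over $x$ rather than $z$ keeps the integrand smooth and avoids the singular behavior of the density near $z = 0$.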
8.1.4 Convergence
Probability
A sequence of random variables $x_n$ converges in probability to a constant $c$ if
$$\lim_{n\to\infty} \Pr[|x_n - c| < \delta] = 1 \quad \forall\ \delta > 0.$$
Distribution
A sequence of random variables $x_n$ with cdf $F_n$ converges in distribution to a random variable $x$ with cdf $F$ if
$$\lim_{n\to\infty} F_n(x) = F(x)$$
at every continuity point of $F$.
Almost Sure
A sequence of random variables $x_n$ defined on a probability space $(\Omega, \mathcal{F}, P)$ converges almost surely to an rv $x$ if
$$\lim_{n\to\infty} x_n(\omega) = x(\omega)$$
for each $\omega \in \Omega$ except for $\omega \in E$ where $P(E) = 0$.
Quadratic Mean
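8.1.5 Some Famous Inequalities
Jensen’s Inequality
If $G$ is concave in $x$, then
$$E[G(x)] \le G[E(x)].$$
This is where risk aversion comes from: with a concave utility function $u$, $E[u(x)] \le u(E[x])$, so the agent weakly prefers receiving $E[x]$ for sure to facing the gamble $x$.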
Chebychev’s Inequality
If the mean $\mu$ and variance $\sigma^2$ exist, then for all $\varepsilon > 0$
$$\Pr[|\tilde x - \mu| \ge \varepsilon] \le \sigma^2/\varepsilon^2$$
Cauchy-Schwarz Inequality
$$(E[xy])^2 \le E[x^2]E[y^2]$$
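A small numerical illustration of the three inequalities, using a fair six-sided die as an assumed example distribution. The Chebychev bound is taken in its variance form, $\Pr[|x - \mu| \ge \varepsilon] \le \sigma^2/\varepsilon^2$, and the Cauchy-Schwarz check uses the arbitrary choice $y = x^2$.

```python
import math

# Fair six-sided die: an assumed example, chosen because everything is exact.
outcomes = [1, 2, 3, 4, 5, 6]
prob = 1.0 / len(outcomes)


def expect(f):
    """E[f(x)] for the die."""
    return sum(f(x) * prob for x in outcomes)


mean = sum(outcomes) / len(outcomes)
variance = expect(lambda x: (x - mean) ** 2)

# Jensen with the concave function G = ln: E[ln x] <= ln E[x].
jensen_lhs = expect(math.log)
jensen_rhs = math.log(mean)


def chebyshev(eps):
    """Return (tail probability, variance bound) for a given eps > 0."""
    tail = sum(prob for x in outcomes if abs(x - mean) >= eps)
    return tail, variance / eps ** 2


# Cauchy-Schwarz with y = x^2: (E[x * x^2])^2 <= E[x^2] E[x^4].
cs_lhs = expect(lambda x: x * x ** 2) ** 2
cs_rhs = expect(lambda x: x ** 2) * expect(lambda x: x ** 4)

print(jensen_lhs <= jensen_rhs)
print([chebyshev(e) for e in (1.0, 2.0, 2.5)])
print(cs_lhs <= cs_rhs)
```

The gap between `jensen_rhs` and `jensen_lhs` is the utility cost of the gamble for a log-utility agent in this toy setting.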
8.1.6 Stein’s Lemma
If $(x, y)$ are bivariate normal, $g$ is everywhere differentiable, and $E[|g'(x)|] < \infty$, then
$$\mathrm{cov}(g(x), y) = E[g'(x)]\,\mathrm{cov}(x, y).$$
This result is useful in working with the fundamental valuation equation $1 = E[mR]$. It can linearize a model under normality.
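As an illustration of the linearization, here is a sketch under stated assumptions: the pricing kernel is a differentiable function $m = g(R_W)$ of a wealth-portfolio return $R_W$ that is jointly normal with the asset return $R_i$, and $R_f = 1/E[m]$ denotes the risk-free return. These symbols are introduced only for this example.

```latex
\begin{align*}
1 &= E[mR_i] = E[m]\,E[R_i] + \mathrm{cov}(m, R_i) \\
E[R_i] - R_f &= -R_f\,\mathrm{cov}\big(g(R_W), R_i\big) \\
             &= -R_f\,E[g'(R_W)]\,\mathrm{cov}(R_W, R_i)
\end{align*}
```

The last step applies Stein's lemma, so a possibly nonlinear kernel delivers expected excess returns that are linear in the covariance with $R_W$.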
8.1.7 Bayes Law
Bayes law is useful for updating probabilities.
$$\mathrm{Prob}(X_i|Y) = \frac{\mathrm{Prob}(X_i Y)}{\mathrm{Prob}(Y)} = \frac{\mathrm{Prob}(Y|X_i)\,\mathrm{Prob}(X_i)}{\sum_{j=1}^{N} \mathrm{Prob}(Y|X_j)\,\mathrm{Prob}(X_j)}$$
8.1.8 Law of Iterated Expectations
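The law of iterated expectations is useful in conditioning down from a finer information set to a coarser one. If $E[|Y|] < \infty$ and $\mathcal{F}_0 \subset \mathcal{F}_1 \subset \mathcal{F}$, then
$$E[Y|\mathcal{F}_0] = E\big[E[Y|\mathcal{F}_1]\,\big|\,\mathcal{F}_0\big].$$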
8.1.9 Stochastic Dominance
To compare two risky payoffs $\tilde c_1$ and $\tilde c_2$, we can use the notion of stochastic dominance. The idea is to choose the asset that dominates the other. Let $F(c) = \Pr[\tilde c_1 \le c]$ and $G(c) = \Pr[\tilde c_2 \le c]$.
First order SD: 1 dominates 2 in the first-order sense if $F(c) \le G(c)\ \forall\ c$.
Second order SD: 1 dominates 2 in the second-order sense if $\int_{-\infty}^{c} F(r)\,dr \le \int_{-\infty}^{c} G(r)\,dr\ \forall\ c$.
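Both definitions can be checked mechanically for discrete payoffs. The sketch below is a minimal illustration; the function names and example payoffs are assumptions. It uses the identity $\int_{-\infty}^{c} F(r)\,dr = E[\max(c - \tilde c, 0)]$ and evaluates the conditions only at the support points, which is enough for step-function cdfs because the integrated cdfs are piecewise linear between those points.

```python
def cdf(payoffs, probs, c):
    """F(c) = Pr[payoff <= c] for a discrete payoff."""
    return sum(p for x, p in zip(payoffs, probs) if x <= c)


def integrated_cdf(payoffs, probs, c):
    """Integral of the cdf up to c, equal to E[max(c - payoff, 0)]."""
    return sum(p * max(c - x, 0.0) for x, p in zip(payoffs, probs))


def dominates(a, pa, b, pb, order, tol=1e-12):
    """True if payoff a dominates payoff b in the first (1) or second (2) order sense."""
    measure = cdf if order == 1 else integrated_cdf
    grid = sorted(set(a) | set(b))
    return all(measure(a, pa, c) <= measure(b, pb, c) + tol for c in grid)


# Asset 1 shifts probability toward high payoffs relative to asset 2.
asset1 = ([1.0, 2.0, 3.0], [0.2, 0.3, 0.5])
asset2 = ([1.0, 2.0, 3.0], [0.5, 0.3, 0.2])
# A sure payoff versus a mean-preserving spread of it.
sure = ([2.0], [1.0])
spread = ([1.0, 3.0], [0.5, 0.5])

print(dominates(*asset1, *asset2, order=1))
print(dominates(*sure, *spread, order=1), dominates(*sure, *spread, order=2))
```

The second example shows a case where first-order dominance fails in both directions but the sure payoff still dominates in the second-order sense.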
8.2 Econometrics
This is a very brief review of some of the highlights from econometrics that are not immediately obvious.
8.2.1 Projection Theorem
If $E[y|x] = \alpha + \beta x$ then
$$\hat\beta = \frac{\mathrm{cov}(x, y)}{\mathrm{var}(x)} \quad\text{and}\quad \hat\alpha = \bar y - \hat\beta \bar x.$$
8.2.2 Cramer-Rao Bound and the Var-Cov Matrix
The Cramer-Rao Bound gives the minimum variance of an estimator. Estimators that achieve the bound are most efficient in their class. Under regularity conditions, the variance of an unbiased estimator $\hat\theta_n$ is bounded by
$$\mathrm{var}(\hat\theta_n) \ge \mathrm{var}(G)^{-1} = -E[H]^{-1}$$
where $G$ and $H$ are the gradient and Hessian of the log-likelihood.
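A standard worked example, given here as a sketch under the assumption of an i.i.d. sample $x_1, \dots, x_n$ from $N(\mu, \sigma^2)$ with $\sigma^2$ known, shows an estimator that attains the bound:

```latex
\begin{align*}
g(\mu) &= -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2 \\
G &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu), \qquad
H = -\frac{n}{\sigma^2} \\
\mathrm{var}(G) &= \frac{n\sigma^2}{\sigma^4} = \frac{n}{\sigma^2} = -E[H] \\
\mathrm{var}(\hat\mu) &\ge \frac{\sigma^2}{n} = \mathrm{var}(\bar x)
\end{align*}
```

The sample mean $\bar x$ is unbiased and has variance exactly $\sigma^2/n$, so it achieves the bound in this setting.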
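8.2.3 Testing: Wald, LM, LR
There are three basic tests of hypotheses: the Wald (W), the likelihood ratio (LR), and the Lagrange multiplier (LM). All three are asymptotically $\chi^2$, but finite sample properties may differ. One test may be preferred over another depending on the ease of calculation under the null or alternative hypothesis. Consider a ML estimate with $g(y, \theta) = \ln[f(y, \theta)]$ the log-likelihood. Let $G = g_\theta$ and $H = g_{\theta\theta'}$. Then $E[G] = 0$, $\mathrm{var}(G) = E[GG'] = -E[H] = I$, and $\mathrm{var}(\hat\theta) \overset{a}{\to} I(\theta)^{-1}$. With $\hat\theta_R$ and $\hat\theta_U$ the restricted and unrestricted estimates,
$$W = (\hat\theta - \theta)'\,[\mathrm{var}(\hat\theta - \theta)]^{-1}\,(\hat\theta - \theta)$$
$$LM = G(\hat\theta_R)'\,I(\hat\theta_R)^{-1}\,G(\hat\theta_R)$$
$$LR = -2[g(\hat\theta_R) - g(\hat\theta_U)]$$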
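The three statistics can be compared in a one-parameter example. The sketch below assumes $n$ Bernoulli trials with $k$ successes and the null hypothesis $p = p_0$. The Wald statistic evaluates the information at the unrestricted estimate, the LM statistic is computed as the squared score divided by the information at the restricted value, and LR compares the two log-likelihoods. The function names and the data are assumptions for this illustration.

```python
import math


def loglik(p, k, n):
    """Bernoulli log-likelihood g(p) for k successes in n trials (0 < k < n)."""
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def score(p, k, n):
    """Gradient G of the log-likelihood with respect to p."""
    return k / p - (n - k) / (1.0 - p)


def information(p, n):
    """Fisher information I(p) = -E[H] for n Bernoulli trials."""
    return n / (p * (1.0 - p))


def trinity(k, n, p0):
    """Wald, LM, and LR statistics for the null hypothesis p = p0."""
    p_hat = k / n  # unrestricted ML estimate
    wald = (p_hat - p0) ** 2 * information(p_hat, n)
    lm = score(p0, k, n) ** 2 / information(p0, n)
    lr = -2.0 * (loglik(p0, k, n) - loglik(p_hat, k, n))
    return wald, lm, lr


wald, lm, lr = trinity(k=60, n=100, p0=0.5)  # assumed example data
print(wald, lm, lr)
```

All three are compared with a $\chi^2(1)$ critical value. In this example they differ in finite samples, which is the point made in the text: the choice among them usually comes down to whether the restricted or unrestricted model is easier to estimate.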
[Figure omitted: plot with curves labeled g and G and distances labeled LR, LM, and W.]
8.3 Continuous-Time Math
8.3.1 Stochastic Processes
8.3.2 Martingales
Prices follow a martingale when adjusted for dividends.
Random Walk
8.3.3 Itô’s Lemma
Consider the diffusion process of a variable $X$:
$$dX(t) = \mu(X, t)\,dt + \sigma(X, t)\,dW(t)$$
where $dW$ is the increment of a standard Wiener process with the properties $E[dW] = 0$ and $E[dW^2] = dt$. Then the function $F(X, t)$ has the stochastic differential equation
$$dF(X, t) = \frac{\partial F}{\partial X}\,dX + \left[\frac{\partial F}{\partial t} + \frac{1}{2}\sigma^2(X, t)\frac{\partial^2 F}{\partial X^2}\right]dt$$
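A standard worked example, sketched under the assumption that $X$ follows geometric Brownian motion, so that $\mu(X, t) = \mu X$ and $\sigma(X, t) = \sigma X$ with constant $\mu$ and $\sigma$. Applying the lemma to $F = \ln X$:

```latex
\begin{align*}
\frac{\partial F}{\partial X} &= \frac{1}{X}, \qquad
\frac{\partial^2 F}{\partial X^2} = -\frac{1}{X^2}, \qquad
\frac{\partial F}{\partial t} = 0 \\
d\ln X &= \frac{1}{X}\left(\mu X\,dt + \sigma X\,dW\right)
          + \frac{1}{2}\sigma^2 X^2\left(-\frac{1}{X^2}\right)dt \\
       &= \left(\mu - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW
\end{align*}
```

The log of $X$ therefore follows an arithmetic Brownian motion with drift $\mu - \sigma^2/2$, which is why $X$ itself is lognormal.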
8.3.4 Cameron-Martin-Girsanov Theorem
If $W_t$ is a $P$-Brownian motion and $\gamma_t$ is an $\mathcal{F}$-previsible process satisfying the boundedness condition
$$E^P\left[\exp\left(\frac{1}{2}\int_0^T \gamma_t^2\,dt\right)\right] < \infty,$$
then there exists a measure $Q$ such that
1. $Q$ is equivalent to $P$
2. $\dfrac{dQ}{dP} = \exp\left(-\int_0^T \gamma_t\,dW_t - \frac{1}{2}\int_0^T \gamma_t^2\,dt\right)$
3. $\tilde W_t = W_t + \int_0^t \gamma_s\,ds$ is a $Q$-Brownian motion.
There is a converse as well.
8.3.5 Special Processes
Arithmetic Brownian Motion
$$dX = \mu\,dt + \sigma\,dW$$
$X$ grows linearly with increasing uncertainty. Over a horizon $\tau$, $X(t + \tau)$ is normally distributed with mean $X(t) + \mu\tau$ and standard deviation $\sigma\sqrt{\tau}$.
Geometric Brownian Motion
$$dX = \mu X\,dt + \sigma X\,dW$$
$X$ grows exponentially at rate $\mu$ with volatility proportional to the level of $X$. The distribution of $X$ is lognormal, which makes it useful in modeling asset prices.
Mean-reverting Process
$$dX = \kappa(\mu - X)\,dt + \sigma X^{\gamma}\,dW$$
If $\gamma = 1/2$ then $X$ is distributed non-central $\chi^2$. It is often used to model interest rates, inflation, and volatility; the CIR model is an example of the square root process. If $\gamma = 0$, this is called an Ornstein–Uhlenbeck process.
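The three processes can be simulated with one-step update rules. This is a minimal sketch using only the standard library: the arithmetic and geometric steps use exact transitions for a standard normal shock `z`, while the mean-reverting step is an Euler approximation (an assumption of this example, with the level floored at zero inside the diffusion term). The function names and parameter values are hypothetical.

```python
import math
import random


def abm_step(x, mu, sigma, tau, z):
    """Exact transition for dX = mu dt + sigma dW over a step of length tau."""
    return x + mu * tau + sigma * math.sqrt(tau) * z


def gbm_step(x, mu, sigma, tau, z):
    """Exact transition for dX = mu X dt + sigma X dW (lognormal)."""
    return x * math.exp((mu - 0.5 * sigma ** 2) * tau + sigma * math.sqrt(tau) * z)


def mean_reverting_step(x, kappa, mu, sigma, gamma, tau, z):
    """Euler step for dX = kappa (mu - X) dt + sigma X^gamma dW."""
    diffusion = sigma * max(x, 0.0) ** gamma
    return x + kappa * (mu - x) * tau + diffusion * math.sqrt(tau) * z


def simulate(step, x0, n_steps, rng, **params):
    """Build a path by repeatedly applying a step function with N(0,1) shocks."""
    path = [x0]
    for _ in range(n_steps):
        path.append(step(path[-1], z=rng.gauss(0.0, 1.0), **params))
    return path


rng = random.Random(0)  # fixed seed so the paths are reproducible
gbm_path = simulate(gbm_step, 100.0, 252, rng, mu=0.08, sigma=0.2, tau=1 / 252)
cir_path = simulate(mean_reverting_step, 0.05, 252, rng,
                    kappa=0.5, mu=0.06, sigma=0.05, gamma=0.5, tau=1 / 252)
print(gbm_path[-1], cir_path[-1])
```

Setting `gamma=0.5` gives the square-root (CIR-type) case and `gamma=0` the constant-volatility Ornstein–Uhlenbeck case. With a zero shock the mean-reverting step simply moves the level a fraction `kappa * tau` of the way toward `mu`.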
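8.3.6 Special Lemma
If
$$\begin{pmatrix} x \\ y \end{pmatrix} \sim N(0, \Omega) \quad\text{with}\quad \Omega = \begin{pmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{pmatrix}$$
then
$$E\left[\left(A_x \exp\left(x - \tfrac{1}{2}\sigma_x^2\right) - A_y \exp\left(y - \tfrac{1}{2}\sigma_y^2\right)\right)^+\right] = A_x N(d_1) - A_y N(d_2)$$
where
$$d_1 = \frac{\ln(A_x/A_y) + \tfrac{1}{2}\Sigma}{\sqrt{\Sigma}}, \qquad d_2 = d_1 - \sqrt{\Sigma},$$
and $\Sigma = \mathrm{var}(x - y) = \sigma_x^2 + \sigma_y^2 - 2\sigma_{xy}$.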
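The lemma is easy to evaluate numerically. The sketch below assumes $d_1 = [\ln(A_x/A_y) + \Sigma/2]/\sqrt{\Sigma}$ and $d_2 = d_1 - \sqrt{\Sigma}$, the form under which the parity relation $E[(X - Y)^+] - E[(Y - X)^+] = A_x - A_y$ holds exactly, and uses that parity as a consistency check. The function names and inputs are assumptions for this example.

```python
import math


def norm_cdf(d):
    """Standard normal cdf N(d), via the error function."""
    return 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))


def exchange_value(a_x, a_y, var_x, var_y, cov_xy):
    """E[(A_x e^{x - var_x/2} - A_y e^{y - var_y/2})^+] for jointly normal (x, y)."""
    big_sigma = var_x + var_y - 2.0 * cov_xy  # var(x - y)
    if big_sigma <= 0.0:
        # x and y move one-for-one, so the payoff is never uncertain in sign.
        return max(a_x - a_y, 0.0)
    root = math.sqrt(big_sigma)
    d1 = (math.log(a_x / a_y) + 0.5 * big_sigma) / root
    d2 = d1 - root
    return a_x * norm_cdf(d1) - a_y * norm_cdf(d2)


# Assumed example inputs.
value = exchange_value(100.0, 100.0, 0.04, 0.0, 0.0)
long_side = exchange_value(110.0, 100.0, 0.09, 0.04, 0.01)
short_side = exchange_value(100.0, 110.0, 0.04, 0.09, 0.01)
print(value)
print(long_side - short_side)  # parity: should equal 110 - 100
```

With $\sigma_y = 0$ the expression reduces to a Black-Scholes-type call value with strike $A_y$, which is one common use of the lemma.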
Bibliography
Admati, Anat, 1985, A noisy rational expectations equilibrium for multiasset securities markets, Econometrica 53, 629–657.
Admati, Anat, and Paul Pfleiderer, 1988, A theory of intraday patterns: Volume and price variability, Review of Financial Studies 1, 3–40.
Admati, Anat, and Paul Pfleiderer, 1989, Divide and conquer: A theory of intraday and day-of-the-week mean effects, Review of Financial Studies 2, 189–223.
Akerlof, George A., 1970, The market for “lemons”: Quality uncertainty and the market mechanism, Quarterly Journal of Economics 84, 488–500.
Amihud, Y., and H. Mendelson, 1980, Dealership markets: Market-making with inventory, Journal of Financial Economics 8, 31–53.
Asquith, Paul, 1995, Convertible bonds are not called late, Journal of Finance 50, 1275–1289.
Asquith, Paul, and David Mullins, 1991, Convertible debt: Corporate call policy and voluntary conversion, Journal of Finance 46, 1273–1289.
Backus, David, Allan Gregory, and Chris Telmer, 1993, Accounting for forward rates in markets for foreign currency, Journal of Finance 48, 1887–1908.
Banz, R., 1981, The relation between the return and market value of common stocks, Journal of Financial Economics 9, 3–18.
Basu, S., 1977, The investment performance of common stocks in relation to their price to earnings ratios: A test of the efficient markets hypothesis, Journal of Finance 32, 663–682.
Bekaert, Geert, and Campbell Harvey, 1995, Time-varying world market integration, Journal of Finance 50, 403–444.
Berger, Phillip, and Eli Ofek, 1995, Diversification’s effect on firm value, Journal of Financial Economics 37, 39–65.
Berk, Jonathan, 1995, A critique of size related anomalies, Review of Financial Studies 8, 275–286.
Betker, Brian, 1995, An empirical examination of pre-packaged bankruptcy, Financial Management 24, 3–18.
Bhattacharya, Sudipto, and George Constantinides, 1989, Frontiers of Modern Financial Theory, vol. I & II of Studies in Financial Economics (Rowman & Littlefield: Totowa, NJ).
Bikhchandani, S., David Hirshleifer, and Ivo Welch, 1992, A theory of fads, fashion, custom, and cultural change as informational cascades, Journal of Political Economy 100, 992–1025.
Billett, Matthew, Mark Flannery, and Jon Garfinkel, 1995, The effect of lender identity on a borrowing firm’s equity return, Journal of Finance 50, 699–718.
Bizjak, John, James Brickley, and Jeffrey Coles, 1993, Stock-based incentive compensation and investment behavior, Journal of Accounting and Economics 16, 349–372.
Black, Fischer, 1972, Capital market equilibrium with restricted borrowing, Journal of Business 45, 444–455.
Black, Fischer, 1976, The dividend puzzle, Journal of Portfolio Management 2, 5–8.
Black, Fischer, Michael Jensen, and Myron Scholes, 1972, The capital asset pricing model: Some empirical tests, in Michael Jensen, ed.: Studies in the Theory of Capital Markets (Praeger: New York, NY).
Black, Fischer, and Myron Scholes, 1973, The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–659.
Blanchard, O., and Mark Watson, 1982, Bubbles, rational expectations, and financial markets, in Crises in the Economic and Financial Structure (Lexington Books: Lexington, MA).
Blume, M., and I. Friend, 1973, A new look at the capital asset pricing model, Journal of Finance 28, 19–33.
Booth, James, and Lena Chua, 1996, Ownership dispersion, costly information, and IPO underpricing, Journal of Financial Economics 41, 291–310.
Breeden, Douglas T., 1979, An intertemporal asset pricing model with stochastic consumption and investment opportunities, Journal of Financial Economics 7, 265–96.
Brown, Roger, and Stephen Schaefer, 1994, The term structure of real interest rates and the Cox, Ingersoll, and Ross model, Journal of Financial Economics 35, 3–42.
Brown, Stephen, and Philip Dybvig, 1986, The empirical implications of the Cox, Ingersoll, and Ross theory of the term structure of interest rates, Journal of Finance 41, 617–632.
Campbell, John Y., Andrew W. Lo, and A. Craig MacKinlay, 1997, The Econometrics of Financial Markets (Princeton University Press: Princeton, NJ).
Chan, K.C., Nai-fu Chen, and David Hsieh, 1984, An exploratory investigation of the firm size effect, Journal of Financial Economics 14, 451–471.
Chan, K.C., G. Andrew Karolyi, Francis Longstaff, and Anthony Sanders, 1992, An empirical comparison of alternative models of the short-term interest rate, Journal of Finance 47, 1209–1227.
Chen, Nai-fu, Richard Roll, and Stephen A. Ross, 1986, Economic forces and the stock market, Journal of Business 59, 383–403.
Cochrane, John, 1998, Asset pricing, Unpublished Book.
Diamond, Douglas, and Robert Verrecchia, 1981, Information aggregation in a noisy rational expectations economy, Journal of Financial Economics 9, 221–235.
Dunn, Kenneth, and Kenneth Eades, 1989, Voluntary conversion of convertible securities and the optimal call strategy, Journal of Financial Economics 23, 273–301.
Eades, Kenneth, Patrick Hess, and E. Han Kim, 1994, Time-series variation in dividend pricing, Journal of Finance 49, 1617–1638.
Eckbo, Espen, and Ronald Masulis, 1992, Adverse selection and the rights offer paradox, Journal of Financial Economics 32, 293–332.
Evans, Martin, and Karen Lewis, 1995, Do long-swings in the dollar affect estimates of the risk premia?, Review of Financial Studies 8, 709–742.
Fama, Eugene, 1980, Agency problems and the theory of the firm, Journal of Political Economy 88, 288–307.
Fama, Eugene, 1984a, Forward and spot exchange rates, Journal of Monetary Economics 14, 319–338.
Fama, Eugene, 1984b, The information in the term structure, Journal of Financial Economics 13, 509–521.
Fama, Eugene, 1991, Efficient capital markets: II, Journal of Finance 46, 1575–1618.
Fama, Eugene, and Kenneth French, 1992, The cross-section of expected stock returns, Journal of Finance 47, 427–465.
Fama, Eugene, and Kenneth French, 1996b, The CAPM is wanted, dead or alive, Journal of Finance.
Fama, Eugene F., and James MacBeth, 1973, Risk, return and equilibrium: Empirical tests, Journal of Political Economy 81, 607–636.
Fazzari, Steven, Glenn Hubbard, and Bruce Peterson, 1988, Financing constraints and corporate investment, Brookings Papers on Economic Activities 1, 141–195.
Foster, Douglass, and S. Viswanathan, 1990, A theory of interday variations in volume, variance and trading costs in securities markets, Review of Financial Studies 3, 593–624.
Froot, Kenneth, and Jeffrey Frankel, 1989, Forward discount bias: Is it an exchange rate risk premium?, Quarterly Journal of Economics Feb., 139–161.
Froot, Kenneth, David Scharfstein, and Jeremy Stein, 1993, Risk management: Coordinating corporate investment and financing policies, Journal of Finance 48, 1629–1658.
Geczy, Christopher, Bernadette Minton, and Catherine Schrand, 1996, Why firms use currency derivatives, Working paper.
Gibbons, Michael, 1982, Multivariate tests of financial models: A new approach, Journal of Financial Economics 10, 3–27.
Gibbons, Michael, and Krishna Ramaswamy, 1993, A test of the Cox, Ingersoll, and Ross model of the term structure, Review of Financial Studies 6, 619–658.
Glosten, Larry, 1989, Insider trading, liquidity, and the role of the monopolist specialist, Journal of Business 62, 211–235.
Glosten, Larry, and P. Milgrom, 1985, Bid, ask, and transaction prices in a specialist market with heterogeneously informed traders, Journal of Financial Economics 14, 71–100.
Graham, John, 1996, Debt and the marginal tax rate, Journal of Financial Economics 41, 41–73.
Grossman, Sanford, 1976, On the efficiency of competitive stock markets where traders have diverse information, Journal of Finance 31, 573–585.
Grossman, Sanford, and J. E. Stiglitz, 1980, On the impossibility of informationally efficient markets, American Economic Review 70, 393–408.
Hansen, Lars Peter, and Robert J. Hodrick, 1983, Risk averse speculation in the forward foreign exchange market: An econometric analysis of linear models, in Exchange Rates and International Macroeconomics, pp. 113–152 (University of Chicago Press: Chicago).
Hansen, Lars Peter, and Ravi Jagannathan, 1991, Implications of securities market data for models of dynamic economies, Journal of Political Economy 99, 225–262.
Hansen, Lars Peter, and S.F.R Richard, 1987, The role of conditioning information in deducing testable restrictions implied by dynamic asset pricing models, Econometrica 55, 587–613.
Harrison, J., and David Kreps, 1979, Martingales and arbitrage in multiperiod securities markets, Journal of Economic Theory 20, 381–408.
Hart, Oliver, and David Kreps, 1986, Price destabilizing speculation, Journal of Political Economy 94, 927–952.
Hellwig, M.F., 1980, On the aggregation of information in competitive markets, Journal of Economic Theory 22, 477–498.
Helwege, Jean, and Nelie Liang, 1996, Is there a pecking order? Evidence from a panel of IPO firms, Journal of Financial Economics 40, 429–458.
Hirshleifer, Jack, 1971, The private and social value of information and the reward to inventive activity, American Economic Review 61, 561–574.
Hotchkiss, Edith Shwalb, 1995, Postbankruptcy resolution: Direct costs and violation of priority claims, Journal of Finance 50, 3–21.
Huang, Chi-fu, and Robert H. Litzenberger, 1988, Foundations for Financial Economics (Prentice-Hall: Englewood Cliffs, NJ).
Huang, Roger, 1989, An analysis of intertemporal pricing for forward foreign exchange contracts, Journal of Finance 44, 183–194.
Ibbotson, Roger, and Jay Ritter, 1995, Initial public offerings, in North-Holland Handbooks of Operations Research and Management Science: Finance, pp. 993–1016 (North-Holland: Amsterdam).
Ingersoll, 1987, Theory of Financial Decision Making, Studies in Financial Economics (Rowman & Littlefield: Savage, MD).
Ingersoll, Jonathan, 1984, Some results in the theory of arbitrage pricing, Journal of Finance 39, 1021–1039.
James, Christopher, 1995, When do banks take equity in debt restructurings, Review of Financial Studies 8.
Jarrow, Robert, V. Maksimovic, and W. T. Ziemba, 1995, Finance, vol. 9 of Handbooks in Operations Research and Management Science (North-Holland: Amsterdam).
Jegadeesh, Narasimhan, and Sheridan Titman, 1993, Returns to buying winners and selling losers: Implications for stock market efficiency, Journal of Finance 48, 65–91.
Jensen, Michael, 1986, Agency costs of free cash flow, corporate finance, and takeovers, American Economic Review 76, 323–329.
Jensen, Michael, and W.H. Meckling, 1976, Theory of the firm: Managerial behavior, agency costs, and ownership structure, Journal of Financial Economics 3, 305–360.
Jensen, Michael, and Kevin Murphy, 1990, Performance pay and top management incentives, Journal of Political Economy 98, 225–264.
Jung, Kooyul, Yong-Cheol Kim, and René Stulz, 1996, Investment opportunities, managerial discretion, and the security issue decision, Journal of Financial Economics 42, 159–185.
Kadlec, Greg, and John McConnell, 1994, The effect of market segmentation and illiquidity on asset prices: Evidence from exchange listings, Journal of Finance 49, 611–636.
Kandel, S., and Robert Stambaugh, 1987, On correlations and inferences about mean-variance efficiency, Journal of Financial Economics 18, 61–90.
Kim, Yong-Cheol, and René Stulz, 1988, The Eurobond market and corporate financial policy: A test of the clientele hypothesis, Journal of Financial Economics 22, 189–205.
Koh, and Walter, 1989, A direct test of Rock’s model of the pricing of unseasoned issues, Journal of Financial Economics 23, 251–272.
Kothari, S., J. Shanken, and R. Sloan, 1995, Another look at the cross-section of expected returns, Journal of Finance 50, 185–224.
Kyle, Albert S., 1985, Continuous auctions and insider trading, Econometrica 50, 1315–1335.
Kyle, Albert S., 1989, Informed speculation with imperfect competition, Review of Economic Studies 56, 317–356.
Lang, Larry, René Stulz, and Ralph Walkling, 1989, Managerial performance, Tobin’s q, and the gain from tender offers, Journal of Financial Economics 24, 137–154.
Lehn, Kenneth, and Annette Poulsen, 1989, Free cash flow and stockholder gains in going private transactions, Journal of Finance 44, 771–787.
Litzenberger, Robert, and Krishna Ramaswamy, 1979, The effect of personal taxes and dividends on capital asset prices: Theory and evidence, Journal of Financial Economics 7, 163–196.
Loderer, Claudio, John Cooney, and Leonard VanDrunen, 1991, The price elasticity of demand for common stock, Journal of Finance 46, 621–651.
Longstaff, Francis, and Eduardo Schwartz, 1992, Interest rate volatility and the term structure: A two-factor general equilibrium model, Journal of Finance 47, 1259–1282.
Loughran, and Ritter, 1995, The new issues puzzle, Journal of Finance 50, 23–51.
Lucas, Robert, 1978, Asset prices in an exchange economy, Econometrica 46, 1429–1445.
Lucas, Robert, 1982, Interest rates and currency prices in a two-country world, Journal of Monetary Economics 10, 335–360.
MacKinlay, A. Craig, 1987, On multivariate tests of the CAPM, Journal of Financial Economics 18, 341–371.
Manne, Henry G., 1965, Mergers and the market for corporate control, Journal of Political Economy 73, 110–120.
Mark, Nelson, 1988, Time-varying betas and risk premia in the pricing of forward foreign exchange contracts, Journal of Financial Economics 22, 335–354.
Markowitz, Harry, 1959, Portfolio Selection: Efficient Diversification of Investments (Wiley: New York).
Marshall, J. M., 1974, Private incentives and information, American Economic Review 64, 373–390.
Masulis, Ronald, 1980, The effects of capital structure change on security prices, Journal of Financial Economics 8, 139–178.
May, Don, 1995, Do managerial motives influence firm risk reduction strategies?, Journal of Finance 50, 1291–1308.
McConnell, John, and Eduardo Schwartz, 1992, The origin of LYONS: A case study in financial innovation, Journal of Applied Corporate Finance pp. 40–47.
Merton, Robert, 1987, A simple model of capital market equilibrium with incomplete information, Journal of Finance 42, 483–510.
Merton, Robert C., 1973, An intertemporal capital asset pricing model, Econometrica 41, 867–887.
Mikkelson, Wayne, and Megan Partch, 1986, Valuation effects of security offerings and the issuance process, Journal of Financial Economics 15, 31–60.
Milgrom, P., and N. Stokey, 1982, Information, trade, and common knowledge, Journal of Economic Theory 26, 17–27.
Miller, Merton, 1977a, Debt and taxes, Journal of Finance 32, 261–276.
Miller, Edward M., 1977b, Risk, uncertainty, and divergence of opinion, Journal of Finance 32, 1151–1168.
Miller, Merton, and Kevin Rock, 1985, Dividend policy under asymmetric information, Journal of Finance 40, 1030–1051.
Mitchell, Mark, and Kenneth Lehn, 1990, Do bad bidders become good targets?, Journal of Political Economy 98, 372–398.
Mitchell, Mark L., and J. Harold Mulherin, 1996, Impact of industry shocks on takeover and restructuring activity, Journal of Financial Economics 41, 193–229.
Morck, R.A., Andrei Shleifer, and Robert Vishny, 1988, Management ownership and market valuation: An empirical analysis, Journal of Financial Economics 20, 293–315.
Murphy, Kevin, 1985, Corporate performance and managerial remuneration: An empirical analysis, Journal of Accounting and Economics 7, 11–42.
Myers, Stewart, 1977, Determinants of corporate borrowing, Journal of Financial Economics 5, 147–175.
Myers, Stewart, 1984, The capital structure puzzle, Journal of Finance 39, 575–592.
Myers, Stewart, and N. Majluf, 1984, Corporate financing and investment decisions when firms have information that investors do not have, Journal of Financial Economics 13, 187–221.
Ofer, Aharon, and Ashok Natarajan, 1989, Convertible call policies: An empirical analysis of an information-signalling hypothesis, Journal of Financial Economics 19, 91–108.
Opler, Tim, and Sheridan Titman, 1995, The debt-equity choice: An analysis of issuing firms, Working Paper.
Pearson, Neal, and Tong Sheng Sun, 1994, Exploiting the conditional density in estimating the term structure: An application to the Cox, Ingersoll, and Ross model, Journal of Finance 49, 1279–1304.
Prabhala, N. R., 1993, On interpreting dividend announcement effects: Free cash flow, clientele, or signalling?, Yale Working Paper.
Puri, Manju, 1996, Commercial banks in investment banking: Conflict of interest or certification role?, Journal of Financial Economics 40, 373–401.
Rajan, Raghuram, 1996, Insiders and outsiders: The choice between informed and arm’s length debt, Journal of Finance 47, 1367–1400.
Rajan, Raghuram, and Henri Servaes, ????, The effect of market conditions on initial public offerings.
Rajan, Raghuram, and Luigi Zingales, 1995, What do we know about capital structure? Some evidence from international data, Journal of Finance 50, 1421–1460.
Reisman, H., 1992, Reference variables, factor structure, and the approximate multibeta representation, Journal of Finance 47, 1303–1314.
Rock, Kevin, 1986, Why new issues are underpriced, Journal of Financial Economics 15, 187–212.
Rock, Kevin, 1989, The specialist’s order book, Unpublished Working Paper.
Roll, Richard, 1977, A critique of the asset pricing theory’s tests, Journal of Financial Economics 4, 129–176.
Roll, Richard, 1984, A simple measure of the effective bid/ask spread in an efficient market, Journal of Finance 39, 1127–1139.
Roll, Richard, 1986, The hubris hypothesis of corporate takeovers, Journal of Business 59, 197–216.
Roll, Richard, and Stephen Ross, 1994, On the cross-sectional relation between expected returns and betas, Journal of Finance 49, 101–122.
Ross, Stephen, 1976, The arbitrage theory of capital asset prices, Journal of Economic Theory 13, 341–360.
Ross, Stephen, 1977a, The determination of financial structure: The incentive signalling approach, Bell Journal of Economics 8, 23–40.
Ross, Stephen, 1977b, Return, risk, and arbitrage, in Risk and Return in Finance, I (Ballinger: Cambridge, MA).
Shanken, Jay, 1982, The arbitrage pricing theory: Is it testable?, Journal of Finance 37, 1129–1140.
Shanken, Jay, 1985, Multivariate tests of the zero-beta CAPM, Journal of Financial Economics 14, 327–348.
Shanken, J., 1987, Multivariate proxies and asset pricing relations: Living with the Roll critique, Journal of Financial Economics 18, 91–110.
Shanken, J., 1992, On the estimation of beta-pricing models, Review of Financial Studies 5, 1–34.
Shanken, Jay, and M. Weinstein, 1990, Macroeconomic variables and asset pricing: Estimation and tests, Working Paper, University of Rochester.
Shiller, Robert J., 1981, Do stock prices move too much to be justified by subsequent changes in dividends?, American Economic Review 71, 421–436.
Shin, Hyun-Han, and René Stulz, 1996, An analysis of divisional investment policy, NBER Working Paper.
Shleifer, and Vishny, 1986, Large shareholders and corporate control, Journal of Political Economy 94, 461–488.
Shleifer, and Vishny, 1992, Liquidation values and debt capacity: A market equilibrium approach, Journal of Finance 47, 1343–1366.
Shleifer, Andrei, 1986, Do demand curves for stock slope down?, Journal of Finance 41, 579–590.
Slezak, Steve, 1994, A theory of the dynamics of security returns around market closures, Journal of Finance 49, 1163–1211.
Sloan, Richard, 1993, Accounting earnings and top executive compensation, Journal of Accounting and Economics 16, 55–100.
Smith, Clifford, 1986, Investment banking and the capital acquisition process, Journal of Financial Economics 15, 3–29.
Smith, Clifford, and Ross Watts, 1992, The investment opportunity set and corporate financing, dividend and compensation policies, Journal of Financial Economics 32, 263–292.
Snow, Karl, 1991, Diagnosing asset pricing models using the distribution of asset returns, Journal of Finance 46, 955–983.
Spence, Michael, 1973, Job market signaling, Quarterly Journal of Economics pp. 355–374.
Spence, Michael, 1974, Competitive and optimal responses to signals: An analysis of efficiency distribution, Journal of Economic Theory 7, 296–332.
Stambaugh, Robert, 1982, On the exclusion of assets from tests of the two parameter model, Journal of Financial Economics 10, 235–268.
Stein, Jeremy, 1992, Convertible bonds as backdoor equity financing, Journal of Financial Economics 32, 3–21.
Stulz, René, 1988, Managerial control of voting rights: Financing policies and the market for corporate control, Journal of Financial Economics 20, 25–54.
Stulz, René, 1995, Rethinking risk management, Working paper.
Titman, Sheridan, and Robert Wessels, 1988, The determinants of capital structure choice, Journal of Finance 43, 1–19.
Tufano, Peter, 1989, Financial innovation and first mover advantages, Journal of Financial Economics 25, 213–240.
Tufano, Peter, 1996, Who manages risk? An empirical examination of risk management practices in the gold mining industry, Journal of Finance 51, 1097–1137.
Vermaelen, Theo, 1981, Common stock repurchases and market signalling: An empirical study, Journal of Financial Economics 9, 138–183.
Weiss, Lawrence, 1990, Bankruptcy resolution: Direct costs and violation of priority of claims, Journal of Financial Economics 27, 285–314.
Welch, Ivo, 1992, Sequential sales, learning, and cascades, Journal of Finance 47, 695–732.
Yermack, David, 1995, Do corporations award stock options effectively?, Journal of Financial Economics 39, 237–269.
Zender, Jaime, 1991, Optimal financial instruments, Journal of Finance 46, 1645–1663.